Very simple regex - regex

I need a regex to match something like this:
<a space><any character/s>#<any character/s><a space>
Yes, it's a very very basic email parser.
Thanks!

Something like this? /^ [^#]+#[^ ]+ $/

The square brackets indicate a character class, which is the set of characters that can be present at that position. So, your regex would match .#. or *#*. Instead, try "\ .*#.*\ " (the quotes are there to show the space at the end; don't include them in your regex).

For testing e-mail, you might use the regex described here:
\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b
It still doesn't cover 100% of e-mails, but the comprehensive version is fairly involved.

^ .+#.+ $
This translates to "the start of the string is followed by a space, one or more characters, the # symbol, one or more characters, and the last character in the string is a space."
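As a quick check of that reading, here is a minimal Python sketch that runs the pattern against a few made-up inputs. The function name and the sample strings are assumptions for illustration, not part of the question.

```python
import re

# Pattern from the answer above: a space, 1+ characters, '#', 1+ characters, a space.
PATTERN = re.compile(r"^ .+#.+ $")

def is_basic_address(text: str) -> bool:
    """Return True if text looks like ' <chars>#<chars> ' including both spaces."""
    return PATTERN.search(text) is not None

# Hypothetical sample inputs.
samples = [" user#example ", "user#example", " #example ", " user# "]
results = {s: is_basic_address(s) for s in samples}
```

Only the first sample passes: the others are missing a surrounding space or have nothing on one side of the #.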

Related

I need a regular expression field validator that accepts letters, numbers, `-. and only one space between words

I have tried using ^[a-zA-Z0-9 `.]*$ . But it is allowing more spaces. And can anyone please explain what is "closed" in this context? Any help is appreciated.
Try this:
/^[a-z0-9\-\.`]+\s{0,1}[a-z0-9\-\.`]+$/gmi
Explaining:
^[a-z0-9\-\.\`]+ # starts with at least one letter/number/-/./`
\s{0,1} # may or may not contain one space - same as: '\s?'
[a-z0-9\-\.\`]+$ # ends with at least one letter/number/-/./`
This one should do a pretty good job:
/*#!(?#!js valid Rev:20150715_1300)
# Validate alphabets numbers `-. and only one space.
^ # Anchor to start of string.
(?=[^ ]+(?:[ ][^ ]+)*$) # Only one space between words.
[a-zA-Z0-9 `.-]* # Zero or more allowed chars (the lookahead already requires at least one).
$ # Anchor to end of string.
!#*/
var valid = /^(?=[^ ]+(?: [^ ]+)*$)[a-zA-Z0-9 `.-]*$/;
Your square brackets (the character class) contain a space, and the class sits at the very beginning of your regex, right after the caret. So you need to exclude spaces at the beginning and end of the text:
/^([a-z0-9\-]+\s{0,1}[a-z0-9\-]+)+$/gmi
You also want to allow the '-' character by escaping it and including it in the character class.
https://regex101.com/
A nice website for testing regex
What I am interpreting from your question would look something like this.
^([a-zA-Z0-9`.]+ ?)*[a-zA-Z0-9`.]+$
It means that for every space in your string, there must be a series of the characters you suggested, and it must end with at least one of those characters as well.
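A small Python sketch of that last pattern, to show which inputs it accepts. The helper name and the test strings are made up for illustration. Note that this particular pattern does not include the - character the question asked for.

```python
import re

# Pattern from the answer above; backtick and dot are literal inside [].
SINGLE_SPACED = re.compile(r"^([a-zA-Z0-9`.]+ ?)*[a-zA-Z0-9`.]+$")

def is_single_spaced(text: str) -> bool:
    """True if text is words of allowed chars separated by exactly one space."""
    # fullmatch avoids '$' also matching just before a trailing newline.
    return SINGLE_SPACED.fullmatch(text) is not None

# Hypothetical inputs: the first two are accepted, the rest are rejected.
accepted = ["hello world", "a.b `c 9"]
rejected = ["hello  world", " hello", "hello ", "hello-world"]
checks = [is_single_spaced(s) for s in accepted + rejected]
```

Because the group nests one quantifier inside another, long inputs that fail to match can backtrack heavily, so this is probably best kept to short form fields.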

Trouble converting regex

This regex:
"REGION\\((.*?)\\)(.*?)END_REGION\\((.*?)\\)"
currently finds this info:
REGION(Test) my user typed this
END_REGION(Test)
I need it to instead find this info:
#region REGION my user typed this
#endregion END_REGION
I have tried:
"#region\\ (.*?)\\\n(.*?)#endregion\\ (.*?)\\\n"
It tells me that the pattern assignment has failed. Can someone please explain what I am doing wrong? I am new to Regex.
It seems the issue lies in the multiline \n. My recommendation is to use the modifier s to avoid multiline complexities like:
/#region\ \(.*?\)(.*?)\s#endregion\s\(.*?\)/s
The s ("single line") modifier makes . match all characters, including line breaks.
Try this:
#region(.*)?\n(.*)?#endregion(.*)?
This works for me when testing here: http://regexpal.com/
When using your original text and regex, the only thing that threw it off is that I did not have a new line at the end because your sample text didn't have one.
Constructing this regex doesn't fail using boost, even if you use the expanded modifier.
Your string to the compiler:
"#region\\ (.*?)\\\n(.*?)#endregion\\ (.*?)\\\n"
After being parsed by the compiler:
#region\ (.*?)\\n(.*?)#endregion\ (.*?)\\n
It looks like you have one too many escapes on the newline.
If you present the regex to Boost as expanded, an unescaped pound sign # is interpreted as the start of a comment.
In that case, you need to escape the pound sign.
\#region\ (.*?)\\n(.*?)\#endregion\ (.*?)\\n
If you don't use the expanded modifier, then you don't need to escape the space characters.
Taking that tack, you can remove the escapes on the spaces and fix up the newline escapes, so that it looks like this raw (what gets passed to the regex engine):
#region (.*?)\n(.*?)#endregion (.*?)\n
And like this as a source code string:
"#region (.*?)\\n(.*?)#endregion (.*?)\\n"
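The same two levels of escaping can be seen in Python, used here only as a stand-in for the Boost setup in the question. The sample text is hypothetical, and re.DOTALL is an assumption added so that the middle group can span the line break.

```python
import re

# Source-code string: "\\n" is a backslash followed by 'n', which the regex
# engine then reads as a newline escape. A raw string spells the same thing.
as_source_string = "#region (.*?)\\n(.*?)#endregion (.*?)\\n"
as_raw_string = r"#region (.*?)\n(.*?)#endregion (.*?)\n"
same_pattern = as_source_string == as_raw_string

# Hypothetical input with the user's text on its own line.
text = "#region Test\nmy user typed this\n#endregion Test\n"

# DOTALL lets '.' cross line breaks so the middle group captures the body.
match = re.search(as_raw_string, text, re.DOTALL)
name, body, end_name = match.groups()
```

Here name and end_name both come out as "Test", and body holds the line the user typed, including its trailing newline.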
Your regular expression has an extra backslash when escaping the newline sequence \\\n; use \\s* instead. Also, for the last capturing group you can use a greedy quantifier and remove the newline sequence.
#region\\ (.*?)\\s*(.*?)#endregion\\ (.*)

Regex to match everything before two forward-slashes (//) not contained in quotes

I've been grappling with some negative lookahead and lookbehind patterns to no avail. I need a regex that will match everything in a string before two forward slashes, unless said characters are in quotes.
For example, in the string "//this is a string//" //A comment about a string about strings,
the substring "//this is a string//" ought to be matched and the rest ignored.
As you can see, the point is to exclude any single-line comments (C++/Java style).
Thanks in advance.
Here you go:
^([^/"]|".+?"|/[^/"])*
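To see what that pattern keeps, here is a short Python sketch. The helper name and the second and third sample lines are invented for illustration.

```python
import re

# Pattern from the answer above: consume non-slash/non-quote characters,
# whole double-quoted strings, or a single slash not followed by '/' or '"'.
CODE_BEFORE_COMMENT = re.compile(r'^([^/"]|".+?"|/[^/"])*')

def code_part(line: str) -> str:
    """Return the part of line that precedes the first // outside quotes."""
    # The star allows an empty match, so match() never returns None here.
    return CODE_BEFORE_COMMENT.match(line).group(0)

quoted = code_part('"//this is a string//" //A comment about a string')
division = code_part('x = a / b; // divide')
no_comment = code_part('no comment here')
```

The match keeps trailing whitespace before the comment, so a caller may want to strip it.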
How about
\/\/[^\"']*$
It will match // if it is not followed by either a " or a '. It's not exactly what you requested, but closely meets your requirements. It will only choke on comments that contain " or ', like
// I like "bread".
Maybe better than no solution.
A python/regex based comment remover I wrote a while back, if it's helpful:
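import re

def remcomment(line):
    # Match either a whole double-quoted string or a bare //.
    # Group 1 is only set for a // that lies outside quotes.
    for match in re.finditer(r'"[^"]*"|(//)', line):
        if match.group(1):
            # Cut the line at the comment and drop trailing whitespace.
            return line[:match.start()].rstrip()
    return line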

Regular expression to match not the beginning/end of a line

I would like a regular expression to match only " that
don't come at the start of a line or after white space at the start of a line
don't come at the end of a line or before white space at the end of a line
I guess I need to use lookbehind and lookahead.
So it should match the " in
zfgjhsgaf jhsa gd " gjhygf" hgf
But not in
"gjhgjkgjhgjhgkk"
"dfsdfsdf"
For Eclipse, try finding by this regex:
(?<!^\s*)"(?!\s*$)
And replacing with:
\"
See this here
(?<!^)"(?!\s*$)
at Regexr
It does not handle whitespace after the beginning of the line. As BoltClock mentioned, variable-length lookbehind is supported by only a few engines (.NET is the only one I know of).
If you use a regex that support it, you can use
(?<!^\s*)"(?!\s*$)
Good documentation for lookahead and lookbehind is in perlretut: perldoc.perl.org/perlretut.html#Looking-ahead-and-looking-behind
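For engines without variable-length lookbehind (Python's re module is one), the same check can be done with a replacement callback instead of lookarounds. This is a sketch of an alternative technique, not the lookbehind approach itself, and the function name is made up.

```python
import re

def escape_inner_quotes(line: str) -> str:
    """Escape each " that has non-whitespace text both before and after it."""
    def repl(m: re.Match) -> str:
        before = line[:m.start()]
        after = line[m.end():]
        # Only whitespace on either side means the quote opens or closes the line.
        if before.strip() and after.strip():
            return '\\"'
        return m.group(0)
    return re.sub('"', repl, line)

inner = escape_inner_quotes('zfgjhsgaf jhsa gd " gjhygf" hgf')
outer = escape_inner_quotes('  "dfsdfsdf"  ')
mixed = escape_inner_quotes('"foo"bar"')
```

In the third sample only the middle quote is escaped, and the quotes at the line edges are left alone.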
^\s*"?.*\S.*(").*?\S.*?"?\s*$
Which supports matching ' "foo"bar" ' assuming that is something that you want to find.
Oh, and it only matches if $1 is set
This one should work
^\s*[^"].*".*[^"]\s*$
I think this regex is expressive enough:
^\s*\S+.*innertext.*\S+\s*$

Regex - Multiline Problem

I think I'm burnt out, and that's why I can't see an obvious mistake. Anyway, I want the following regex:
#BIZ[.\s]*#ENDBIZ
to grab me the #BIZ tag, #ENDBIZ tag and all the text in between the tags. For example, if given some text, I want the expression to match:
#BIZ
some text some test
more text
maybe some code
#ENDBIZ
At the moment, the regex matches nothing. What did I do wrong?
ADDITIONAL DETAILS
I'm doing the following in PHP
preg_replace('/#BIZ[.\s]*#ENDBIZ/', 'my new text', $strMultiplelines);
The dot loses its special meaning inside a character class — in other words, [.\s] means "match period or whitespace". I believe what you want is [\s\S], "match whitespace or non-whitespace".
preg_replace('/#BIZ[\s\S]*#ENDBIZ/', 'my new text', $strMultiplelines);
Edit: A bit about the dot and character classes:
By default, the dot does not match newlines. Most (all?) regex implementations have a way to specify that it match newlines as well, but it differs by implementation. The only way to match (really) any character in a compatible way is to pair a shorthand class with its negation — [\s\S], [\w\W], or [\d\D]. In my personal experience, the first seems to be most common, probably because this is used when you need to match newlines, and including \s makes it clear that you're doing so.
Also, the dot isn't the only special character which loses its meaning in character classes. In fact, the only characters which are special in character classes are ^, -, \, and ]. Check out the "Metacharacters Inside Character Classes" section of the character classes page on Regular-Expressions.info.
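The difference is easy to reproduce. The following Python sketch (the question uses PHP, and the sample text here is invented) compares the original class with the [\s\S] form and with a plain dot, with and without the dot-matches-newline flag.

```python
import re

# Hypothetical multi-line input containing one tagged block.
text = "intro\n#BIZ\nsome text some test\nmore text\n#ENDBIZ\noutro"

# [.\s] means "a literal period or whitespace", so the letters stop the match.
broken = re.search(r"#BIZ[.\s]*#ENDBIZ", text)

# [\s\S] matches any character at all, newlines included.
portable = re.search(r"#BIZ[\s\S]*#ENDBIZ", text)

# A plain dot stops at the first newline unless DOTALL is set.
plain_dot = re.search(r"#BIZ.*#ENDBIZ", text)
dotall = re.search(r"#BIZ.*#ENDBIZ", text, re.DOTALL)

block = portable.group(0)
```

Here broken and plain_dot are both None, while portable and dotall match the same block from #BIZ through #ENDBIZ.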
// Replaces all of your code with "my new text", but I do not think
// this is actually what you want based on your description.
preg_replace('/#BIZ(.+?)#ENDBIZ/s', 'my new text', $contents);
// Actually "gets" the text, which is what I think you might be looking for.
preg_match('/(#BIZ)(.+?)(#ENDBIZ)/s', $contents, $matches);
list($dummy, $startTag, $data, $endTag) = $matches;
This should work
#BIZ[\s\S]*#ENDBIZ
You can try this online Regular Expression Testing Tool
The mistake is the character group [.\s] that will match a dot (not any character) or white space. You probably tried to get .* with . matching newline characters, too. You achieve this by enabling the single line option ((?s:) does this in .NET regex).
(?s:#BIZ.*?#ENDBIZ)
Depending on the environment you're using your regex in, it may need special care to properly parse multiline text, eg re.DOTALL in Python. So what environment is that?
You can use
preg_replace('/#BIZ.*?#ENDBIZ/s', 'my new text', $strMultiplelines);
The 's' modifier says "let the dot match anything, even the newline character". The '?' says don't be greedy, which matters in a case such as:
foo
#BIZ
some text some test
more text
maybe some code
#ENDBIZ
bar
#BIZ
some text some test
more text
maybe some code
#ENDBIZ
hello world
The non-greedy match won't swallow the "bar" in the middle.
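A Python sketch of the same comparison, using a shortened version of the two-block text above. The answer itself uses PHP's preg_replace; re.DOTALL plays the role of the s modifier here.

```python
import re

# Shortened, hypothetical version of the two-block example.
text = (
    "foo\n#BIZ\nsome text\n#ENDBIZ\n"
    "bar\n#BIZ\nmore text\n#ENDBIZ\n"
    "hello world"
)

# Lazy: each #BIZ ... #ENDBIZ pair is replaced on its own, so "bar" survives.
lazy = re.sub(r"#BIZ.*?#ENDBIZ", "my new text", text, flags=re.DOTALL)

# Greedy: one match runs from the first #BIZ to the last #ENDBIZ.
greedy = re.sub(r"#BIZ.*#ENDBIZ", "my new text", text, flags=re.DOTALL)
```

The lazy version yields two replacements with "bar" between them. The greedy version yields a single replacement and loses "bar".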
Unless I am missing something, you handle this the same way that you would in Perl, with either the /m or /s modifier at the end? Oddly enough the other answers that rather correctly pointed this out got down voted?!
It looks like you're using a JavaScript regex. You'll need to enable multiline by specifying the m flag at the end of the expression:
var re = /^deal$/mg