Regular expression to match not the beginning/end of a line - regex

I would like a regular expression that matches only those " characters that
don't come at the start of a line or after white space at the start of a line
don't come at the end of a line or before white space at the end of a line
I guess I need to use lookbehind and lookahead.
So it should match the " characters in
zfgjhsgaf jhsa gd " gjhygf" hgf
But not in
"gjhgjkgjhgjhgkk"
"dfsdfsdf"

For Eclipse, try finding by this regex:
(?<!^\s*)"(?!\s*$)
And replacing with:
\"

This simpler version can be tried at Regexr:
(?<!^)"(?!\s*$)
It does not handle whitespace after the beginning of the line. As BoltClock mentioned, variable-length lookbehind is supported by only a few engines (.NET is the only one I know of).
If your regex engine supports it, you can use
(?<!^\s*)"(?!\s*$)
Good documentation for lookahead and lookbehind is in perlretut: perldoc.perl.org/perlretut.html#Looking-ahead-and-looking-behind
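Python's built-in re module is one of the engines that rejects variable-length lookbehind, so (?<!^\s*) cannot be used there. Here is a minimal sketch of a workaround, assuming each line is processed on its own: peel off the optional leading and trailing quote with capture groups and escape only what is left in between. The name escape_inner_quotes is made up for this illustration.

```python
import re

# head: leading whitespace plus an optional quote
# body: everything in between (lazy, so the tail can claim a final quote)
# tail: an optional quote plus trailing whitespace
LINE = re.compile(r'(\s*"?)(.*?)("?\s*)')

def escape_inner_quotes(line):
    """Backslash-escape every " that is neither the first nor the last
    non-whitespace character of a single line."""
    m = LINE.fullmatch(line)
    if m is None:  # can only happen if the line has an embedded newline
        return line
    head, body, tail = m.groups()
    return head + body.replace('"', '\\"') + tail
```

Calling escape_inner_quotes on each line of a file leaves lines like "dfsdfsdf" untouched and escapes only the quotes in the middle.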

^\s*"?.*\S.*(").*?\S.*?"?\s*$
Which supports matching ' "foo"bar" ' assuming that is something that you want to find.
Oh, and it only matches if $1 is set

This one should work
^\s*[^"].*".*[^"]\s*$

I think this regex is expressive enough:
^\s*\S+.*innertext.*\S+\s*$

Related

Regex: Exact match string ending with specific character

I'm using Java. So I have a comma separated list of strings in this form:
aa,aab,aac
aab,aa,aac
aab,aac,aa
I want to use regex to remove aa and the trailing ',' if it is not the last string in the list. I need to end up with the following result in all 3 cases:
aab,aac
Currently I am using the following pattern:
"aa[,]?"
However it is returning:
b,c
If lookarounds are available, you can write:
,aa(?![^,])|(?<![^,])aa,
with an empty string as replacement.
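As a quick check of the lookaround version, here is a small sketch. The question is about Java; Python's re module is used here only for illustration, on the assumption that these single-character lookarounds behave the same way in java.util.regex. The function name remove_aa is hypothetical.

```python
import re

# ",aa" not followed by a non-comma, or "aa," not preceded by a non-comma.
PATTERN = re.compile(r',aa(?![^,])|(?<![^,])aa,')

def remove_aa(csv_line):
    # Replace the matched item (and one adjacent comma) with nothing.
    return PATTERN.sub('', csv_line)
```

Because the lookarounds insist on a comma or a line edge on both sides, items such as aab or aaa are left alone.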
Otherwise, with a POSIX ERE syntax you can do it with a capture:
^(aa(,|$))+|(,aa)+(,|$)
with the 4th group as replacement (so $4 or \4)
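A rough sketch of the capture-group variant follows. Python's re is not a POSIX ERE engine, but the pattern uses only syntax the two share, so it should illustrate the idea. It relies on Python (3.5 and later) substituting an empty string for a group that did not take part in the match. The function name is made up.

```python
import re

PATTERN = re.compile(r'^(aa(,|$))+|(,aa)+(,|$)')

def remove_aa_with_capture(csv_line):
    # \4 is the separator (or end of line) that followed the removed item;
    # when the first alternative matched, group 4 is unset and yields ''.
    return PATTERN.sub(r'\4', csv_line)
```

Putting group 4 back is what keeps the comma between the remaining neighbours when aa sits in the middle of the list.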
Without knowing your regex flavor, I propose this solution, which assumes that the engine supports \b.
I use Perl as the demo environment and replace with "_" so that the match is visible.
perl -pe "s/\baa,|,aa\b/_/"
\b is the "word boundary" anchor, i.e. it matches at any start or end of something that looks like a word. It handles line end, line start, blank, and comma alike.
Using it, two alternatives suffice to cover all the cases in your sample input.
Output (interleaved with the input, for both lines ending in a newline and lines ending in a blank):
aa,aab,aac
_aab,aac
aab,aa,aac
aab_,aac
aab,aac,aa
aab,aac_
aa,aab,aac
_aab,aac
aab,aa,aac
aab_,aac
aab,aac,aa
aab,aac_
If \b is unknown in your regex engine, then please state which one you are using, i.e. which tool (e.g. perl, awk, notepad++, sed, ...). In that case it might also be necessary to replace instead of delete, i.e. to fine-tune a "," or "" as the replacement. To support that, please show the context of your regex, i.e. the replacing mechanism you are using. If you are deleting, then please switch to replacing beforehand.
(I picked up a suggestion from a comment by gisek that the capturing groups are not needed. I usually use () generously, including in other syntaxes; in my opinion, not having to think about or look up evaluation order is a net benefit in time and risk. But after testing, I use this terser, more elegant way.)
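For anyone without Perl at hand, the same experiment can be sketched in Python, whose re module also supports \b. The function name mark_aa is made up, and count=1 is used to mirror Perl's s/// without the /g flag.

```python
import re

def mark_aa(line):
    # Replace the first match with "_" so the matched span is visible.
    return re.sub(r'\baa,|,aa\b', '_', line, count=1)
```

On the three sample lines this gives _aab,aac, aab_,aac and aab,aac_, matching the Perl output above.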
If your regex engine supports positive lookaheads and positive lookbehinds, this should work:
,aa(?=,)|(?<=,)aa,|(,|^)aa(,|$)
You could probably use the following and replace it with nothing:
(aa,|,aa$)
Either aa, when it's at the beginning or in the middle of the string,
or ,aa$ when it's at the end of the string.
As you want to delete aa followed by a comma or the end of the line, this should do the trick: ,aa(?=,|$)|^aa,

Search for all words that start with $(' or with $(" but are not followed by '#' or '.'

Using a regular expression in Visual Studio search, find:
All words that start with $(' or with $(" but where the next character is not '#' or '.'
So it should not find:
$('#id1')
$('.class')
$(".class")
But it should find:
$('div')
$("span")
I tried this, but it is not working:
$/(['"][^#/.].*
\$\(((?!#|\.).)*\)
This should work for your case.
In your regex you need
1) to escape $ as \$
2) to quantify your character class, i.e. [^#/.]* and not [^#/.].*, which checks only the first character after (" and then allows # or . later on.
So your regex would be
\$\(['"][^#.]*
You also don't need to escape . in a character class.
See demo.
http://regex101.com/r/sU3fA2/54
Use lookarounds like below,
(?<=\s|^)\$\(['"](?![#.])\S+
OR
(?<=\s|^)\$\(['"](?![#.])[^()]*\)
OR
(?<=\s|^)\$\((['"])(?![#.])(?:(?!\1).)*\1\)
This won't match the wrong formats like $("foo')
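Here is a small sketch of the last pattern in Python. One substitution is needed: Python's re rejects (?<=\s|^) because the two lookbehind alternatives have different widths, so the sketch uses (?<!\S), which expresses the same "whitespace or start of text before it" condition. The names SELECTOR_CALL and find_plain_selectors are made up for the example.

```python
import re

SELECTOR_CALL = re.compile(r'''(?<!\S)\$\((['"])(?![#.])(?:(?!\1).)*\1\)''')

def find_plain_selectors(source):
    # Return every $('...') or $("...") call whose selector does not
    # start with '#' or '.' and whose quotes match.
    return [m.group(0) for m in SELECTOR_CALL.finditer(source)]
```

The backreference \1 is what rejects mismatched quotes such as $("foo').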

Trouble converting regex

This regex:
"REGION\\((.*?)\\)(.*?)END_REGION\\((.*?)\\)"
currently finds this info:
REGION(Test) my user typed this
END_REGION(Test)
I need it to instead find this info:
#region REGION my user typed this
#endregion END_REGION
I have tried:
"#region\\ (.*?)\\\n(.*?)#endregion\\ (.*?)\\\n"
It tells me that the pattern assignment has failed. Can someone please explain what I am doing wrong? I am new to Regex.
It seems the issue lies in the multiline \n. My recommendation is to use the s modifier to avoid multiline complexities, like:
/#region\ \(.*?\)(.*?)\s#endregion\s\(.*?\)/s
The s ("single line") modifier makes . match all characters, including line breaks.
Try this:
#region(.*)?\n(.*)?#endregion(.*)?
This works for me when testing here: http://regexpal.com/
When using your original text and regex, the only thing that threw it off is that I did not have a new line at the end because your sample text didn't have one.
Constructing this regex doesn't fail using boost, even if you use the expanded modifier.
Your string to the compiler:
"#region\\ (.*?)\\\n(.*?)#endregion\\ (.*?)\\\n"
After being parsed by the compiler:
#region\ (.*?)\\n(.*?)#endregion\ (.*?)\\n
It looks like you have one too many escapes on the newline.
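The two escaping layers are easy to mix up, so here is a tiny sketch that shows what each spelling turns into before the regex engine sees it. It uses Python string literals, on the assumption that for \\ and \n they follow the same escape rules as C++ literals. The variable names are made up.

```python
# What the regex engine receives for each spelling in source code.
too_many = "\\\n"  # an escaped backslash followed by a real newline character
intended = "\\n"   # a backslash followed by the letter n: the regex escape \n

print(repr(too_many), repr(intended))
```

So with three backslashes the engine never sees the two-character regex escape \n; it sees a backslash followed by an actual line break.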
If you present the regex as expanded to boost, an unescaped pound sign # is interpreted as the start of a comment.
In that case, you need to escape the pound sign.
\#region\ (.*?)\\n(.*?)\#endregion\ (.*?)\\n
If you don't use the expanded modifier, then you don't need to escape the space characters.
Taking that tack, you can remove the escapes on the spaces and fix up the newline escapes; it then looks like this raw (what gets passed to the regex engine):
#region (.*?)\n(.*?)#endregion (.*?)\n
And like this as a source code string:
"#region (.*?)\\n(.*?)#endregion (.*?)\\n"
Your regular expression has an extra backslash when escaping the newline sequence \\\n; use \\s* instead. Also, for the last capturing group you can use a greedy quantifier and remove the newline sequence.
#region\\ (.*?)\\s*(.*?)#endregion\\ (.*)
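To see the three capture groups in action, here is a rough sketch built on the cleaned-up pattern #region (.*?)\n(.*?)#endregion (.*?)\n from the answers above, written for Python's re rather than boost. Two things are my own assumptions: DOTALL is switched on so the body may span several lines, and the final newline is made optional because the sample text does not end with one. The names REGION and parse_region are made up.

```python
import re

REGION = re.compile(r'#region (.*?)\n(.*?)#endregion (.*?)(?:\n|$)', re.DOTALL)

def parse_region(text):
    # Returns (opening name, body, closing name), or None if nothing matches.
    m = REGION.search(text)
    return m.groups() if m else None
```

With the sample from the question the body group is empty, because the user text sits on the #region line itself; with the text on its own line it ends up in the second group.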

Very simple regex

I need a regex to match something like this:
<a space><any character/s>#<any character/s><a space>
Yes, it's a very very basic email parser.
Thanks!
Something like this? /^ [^#]+#[^ ]+ $/
The square brackets indicate a character class, which lists the characters that can be present at that position. So, your regex would match .#. or *#*. Instead, try "\ .*#.*\ " (the quotes are there to show the space at the end; don't include them in your regex).
For testing e-mail, you might use the regex described here:
\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b
It still doesn't cover 100% of e-mails, but the comprehensive version is fairly involved.
^ .+#.+ $
This translates to "the start of the string is followed by a space, one or more characters, the # symbol, one or more characters, and the last character in the string is a space."

Little vim regex

I have a bunch of strings that look like this: '../DisplayPhotod6f6.jpg?t=before&tn=1&id=130', and I'd like to take out everything after the question mark, to look like '../DisplayPhotod6f6.jpg'.
s/\(.\.\.\/DisplayPhoto.\{4,}\.jpg\)*'/\1'/g
This regex is capturing some but not all occurrences, can you see why?
\.\{4,} is trying to match 4 or more . characters. What it looks like you wanted is "match 4 or more of any character" (.\{4,}) but "match 4 or more non-. characters" ([^.]\{4,}) might be more accurate. You'll also need to change the lone * at the end of the pattern to .* since the * is currently applying to the entire \(\) group.
I think the easiest way to go for this is:
s/?.*$/'/g
This says: delete everything after the question mark and replace it with a single quote.
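If you want to try the idea outside Vim, here is a rough Python equivalent of that substitution. The function name strip_query is made up. Note one difference: in Vim's default mode ? is a literal character, while in Python it is a quantifier and has to be escaped.

```python
import re

def strip_query(line):
    # Drop everything from the first '?' to the end of the line and
    # put the closing single quote back.
    return re.sub(r"\?.*$", "'", line)
```

Lines without a question mark pass through unchanged.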
I would use macros, which are sometimes simpler than a regexp (and interactive):
qa
/DisplayPhoto<Enter>
f?dt'
n
q
And then some @a, or 20000@a to go through all lines.
The following regexp: /(\.\./DisplayPhoto.*\.jpg)/gi
tested against the following examples:
../DisplayPhotocef3.jpg?t=before&tn=1&id=54
../DisplayPhotod6f6.jpg?t=before&tn=1&id=130
results in:
../DisplayPhotocef3.jpg
../DisplayPhotod6f6.jpg
%s/\('\.\.\/DisplayPhoto\w\{4,}\.jpg\).*'/\1'/g
Some notes:
% will cause the swap to work on all lines.
\w instead of '.', in case there are some malformed file names.
Replace '.' at the start of your matching regex with ' which is exactly what it should be matching.