If I have the following data:
"test1"."test2" AND "test1"."test2"
What regex can I use to match "test1"."test2"?
I tried the following but it did not work.
\b"test1"."test2"(\s+|$)
In the given example I'd like to match both occurrences of "test1"."test2".
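\b matches at a word boundary, i.e. at a position with a word character (letter, digit, or underscore) on one side and a non-word character or the edge of the string on the other. Since " is not a word character, \b can only succeed in front of it when a word character comes right before it. That is not the case at either place where "test1"."test2" starts, so the assertion fails, and with it the entire regex.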
Drop the \b, escape the dot, and you're set.
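To see the difference on the sample data, here is a small sketch using Python's re module (an assumption on my part, since the question does not name a regex flavor). The variable names are only illustrative.

```python
import re

data = '"test1"."test2" AND "test1"."test2"'

# Original attempt: \b cannot succeed directly before the opening quote here.
original = re.compile(r'\b"test1"."test2"(\s+|$)')

# Suggested fix: no \b, and the dot escaped so it only matches a literal period.
fixed = re.compile(r'"test1"\."test2"')

original_match = original.search(data)   # None: the pattern never matches
fixed_matches = fixed.findall(data)      # both occurrences

print(original_match)
print(fixed_matches)
```

Escaping the dot matters on other input: unescaped, it would also accept something like "test1"x"test2".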
This should work
"test1"\."test2"
Hello good afternoon!!
I'm new to the world of regular expressions and would like some help creating the following expression!
I have a query that returns the following values:
caixa-pod
config-pod
consultas-pod
entregas-pod
monitoramento-pod
vendas-pod
I would like the results to be presented as follows:
caixa
config
consultas
entregas
monitoramento
vendas
In this case, it would exclude the word "-pod" from each value.
I would try (.*)-pod. It is not clear where you want to use the regexp, and the right expression can differ depending on that. I am guessing it is for a dashboard variable.
You can try
\b[a-z]*(?=-pod)\b
This regex basically tells the regex engine to match
\b a word boundary
[a-z]* any number of lowercase characters in range a-z (feel free to extend to whatever is needed e.g. [a-zA-Z0-9] matches all alphanumeric characters)
(?=-pod) followed by -pod but exclude that from the result (positive lookahead)
\b another word boundary
\b matches at a word boundary: the position between a word character and a non-word character, or between a word character and the start or end of the string.
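Both suggestions can be tried against the listed values. This is a sketch in Python's re module, which is an assumption: the tool where the expression will actually be used is not stated and may behave differently. Note that the lookahead version returns the name as the whole match, while (.*)-pod returns it as capture group 1.

```python
import re

values = [
    "caixa-pod", "config-pod", "consultas-pod",
    "entregas-pod", "monitoramento-pod", "vendas-pod",
]

# Lookahead version: "-pod" must follow but is not part of the match.
lookahead = re.compile(r'\b[a-z]*(?=-pod)\b')

# Capture-group version: the name is whatever group 1 holds.
capture = re.compile(r'(.*)-pod')

via_lookahead = [lookahead.search(v).group(0) for v in values]
via_capture = [capture.fullmatch(v).group(1) for v in values]

print(via_lookahead)
print(via_capture)
```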
I am having problems creating a regex validator that checks to make sure the input has uppercase or lowercase alphabetical characters, spaces, periods, underscores, and dashes only. Couldn't find this example online via searches. For example:
These are ok:
Dr. Marshall
sam smith
.george con-stanza .great
peter.
josh_stinson
smith _.gorne
Anything containing other characters is not okay. That is numbers, or any other symbols.
The regex you're looking for is ^[A-Za-z.\s_-]+$
^ asserts that the regular expression must match at the beginning of the subject
[] is a character class - any character that matches inside this expression is allowed
A-Z allows a range of uppercase characters
a-z allows a range of lowercase characters
. matches a literal period; inside a character class it is not the "any character" wildcard
\s matches whitespace (spaces and tabs, and also line breaks)
_ matches an underscore
- matches a dash (hyphen); we have it as the last character in the character class so it doesn't get interpreted as being part of a character range. We could also escape it (\-) instead and put it anywhere in the character class, but that's less clear
+ asserts that the preceding expression (in our case, the character class) must match one or more times
$ Finally, this asserts that we're now at the end of the subject
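As a quick check, the expression can be run over the sample names from the question. This sketch assumes Python's re module; is_valid and the rejected inputs are made up for illustration.

```python
import re

NAME_RE = re.compile(r'^[A-Za-z.\s_-]+$')

def is_valid(text):
    """Return True if text contains only letters, whitespace, '.', '_' and '-'."""
    return NAME_RE.match(text) is not None

accepted = [
    "Dr. Marshall", "sam smith", ".george con-stanza .great",
    "peter.", "josh_stinson", "smith _.gorne",
]
rejected = ["sam smith2", "josh@stinson", ""]  # digit, symbol, empty input

for name in accepted + rejected:
    print(repr(name), is_valid(name))
```

The empty string is rejected because + requires at least one character.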
When you're testing regular expressions, you'll likely find a tool like regexpal helpful. This allows you to see your regular expression match (or fail to match) your sample data in real time as you write it.
Check out the basics of regular expressions in a tutorial. All it requires is two anchors and a repeated character class:
^[a-zA-Z ._-]*$
If you use the case-insensitive modifier, you can shorten this to
^[a-z ._-]*$
Note that the space is significant (it is just a character like any other).
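A short sketch of the case-insensitive variant, assuming Python's re module, where the modifier is spelled re.IGNORECASE. Two things differ from a version using \s and +: a literal space in the class does not accept tabs, and * lets the empty string through.

```python
import re

# Case-insensitive flag lets [a-z] cover uppercase letters as well.
LOOSE_RE = re.compile(r'^[a-z ._-]*$', re.IGNORECASE)

def is_valid(text):
    return LOOSE_RE.match(text) is not None

print(is_valid("Dr. Marshall"))  # uppercase accepted thanks to the flag
print(is_valid(""))              # empty input passes because of *
print(is_valid("sam\tsmith"))    # a tab is not a space, so this fails
print(is_valid("sam smith2"))    # digits are not allowed
```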
I need a regex that matches an expression ending with a word boundary, but which does not consider the hyphen as a boundary.
i.e. get all expressions matched by
type ([a-z])\b
but do not match e.g.
type a-1
To rephrase: I want an equivalent of the word boundary operator \b which, instead of using the word character class [A-Za-z0-9_], uses the extended class [A-Za-z0-9_-].
You can use a lookahead for this, the shortest would be to use a negative lookahead:
type ([a-z])(?![\w-])
(?![\w-]) would mean "fail the match if the next character is in \w or is a -".
Here is an option that uses a normal lookahead:
type ([a-z])(?=[^\w-]|$)
You can read (?=[^\w-]|$) as "only match if the next character is not in the character class [\w-], or this is the end of the string".
See it working: http://www.rubular.com/r/NHYhv72znm
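The same comparison can be reproduced locally. The sketch below assumes Python's re module and a few made-up sample strings; both variants should agree on every input.

```python
import re

negative = re.compile(r'type ([a-z])(?![\w-])')     # negative lookahead
positive = re.compile(r'type ([a-z])(?=[^\w-]|$)')  # positive lookahead

samples = ["type a", "type a-1", "type ab", "type a."]

# Map each sample to (matched by negative version, matched by positive version).
results = {
    s: (negative.search(s) is not None, positive.search(s) is not None)
    for s in samples
}

for sample, outcome in results.items():
    print(repr(sample), outcome)
```

Only "type a" and "type a." match: in the other two the letter is followed by a hyphen or another word character.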
I had a pretty similar problem except I didn't want to consider the '*' as a boundary character. Here's what I did:
\b(?<!\*)([^\s\*]+)\b(?!\*)
Basically, if you're at a word boundary, look back one character and don't match if the previous character was an '*'. If you're in the middle, don't match on a space or asterisk. If you're at the end, make sure the end isn't an asterisk. In your case, I think you could use \w instead of \s. For me, this worked in these situations:
*word
wo*rd
word*
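Here is a sketch of that pattern in Python's re module (an assumption; the answer does not say which engine was used), with the asterisk in the final lookahead escaped. I am reading "worked" as meaning that none of the three strings above is matched, while a plain word still is.

```python
import re

# Word boundary on both sides, with no asterisk directly before or after.
pattern = re.compile(r'\b(?<!\*)([^\s\*]+)\b(?!\*)')

samples = ["word", "*word", "wo*rd", "word*"]

# Store the captured text, or None when the sample is rejected.
results = {}
for s in samples:
    m = pattern.search(s)
    results[s] = m.group(1) if m else None

for sample, found in results.items():
    print(repr(sample), found)
```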
I have this regular expression
([A-Z], )*
which should match something like
test, (with a space after the comma)
How do I change the regex so that it doesn't match if there are any characters after the space?
For example if I had:
test, test
I'm looking to do something similar to
([A-Z], ~[A-Z])*
Cheers
Use the following regular expression:
^[A-Za-z]*, $
Explanation:
^ matches the start of the string.
[A-Za-z]* matches 0 or more letters (case-insensitive) -- replace * with + to require 1 or more letters.
, matches a comma followed by a space.
$ matches the end of the string, so if there's anything after the comma and space then the match will fail.
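A minimal check of this expression, assuming Python's re module since the question does not name a language:

```python
import re

pattern = re.compile(r'^[A-Za-z]*, $')

ok = pattern.search("test, ")            # letters, comma, space, then end
trailing = pattern.search("test, test")  # extra text after the space
no_space = pattern.search("test,")       # the space after the comma is missing

print(ok is not None, trailing is not None, no_space is not None)
```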
As has been mentioned, you should specify which language you're using when you ask a Regex question, since there are many different varieties that have their own idiosyncrasies.
^([A-Z]+, )?$
The difference between mine and Donut's is that his will match ", " (a bare comma and space) and fail for the empty string, while mine will match the empty string and fail for ", ". His also accepts both upper and lower case; with mine you'll have to add case-insensitivity to the options of your regex function, but it stays closer to your example.
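The edge cases are easy to confirm side by side. This sketch assumes Python's re module; the variable names are only labels for the two expressions.

```python
import re

donut = re.compile(r'^[A-Za-z]*, $')
other = re.compile(r'^([A-Z]+, )?$')
other_ci = re.compile(r'^([A-Z]+, )?$', re.IGNORECASE)

def matches(regex, text):
    return regex.search(text) is not None

# Bare ", " and the empty string are where the two expressions disagree.
print(matches(donut, ", "), matches(donut, ""))
print(matches(other, ", "), matches(other, ""))

# Lowercase input needs the case-insensitive option with the second expression.
print(matches(other, "test, "), matches(other_ci, "test, "))
```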
I am not sure which regex engine/language you are using, but there is often something like a negated character class, e.g. [^a-z], meaning "any character other than a lowercase letter".
I have a regular expression to escape all special characters in a search string. This works great, however I can't seem to get it to work with word boundaries. For example, with the haystack
add +
or
add (+)
and the needle
+
the regular expression /\+/gi matches the "+". However the regular expression /\b\+/gi doesn't. Any ideas on how to make this work?
Using
add (plus)
as the haystack and /\bplus/gi as the regex, it matches fine. I just can't figure out why the escaped characters are having problems.
\b is a zero-width assertion: it doesn't consume any characters, it just asserts that a certain condition holds at a given position. A word boundary asserts that the position is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one. (A "word character" is a letter, a digit, or an underscore.) In your string:
add +
...there's a word boundary at the beginning because the a is not preceded by a word character, and there's one after the second d because it's not followed by a word character. The \b in your regex (/\b\+/) is trying to match between the space and the +, which doesn't work because neither of those is a word character.
Try changing it to:
/\b\s?\+/gi
Edit:
Extend this concept as far as you want. If you want the first + after any word boundary:
/\b[^+]*\+/gi
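The behaviour is the same outside JavaScript. The sketch below uses Python's re module as a stand-in (an assumption, since the question uses /.../gi literals), with the literal plus sign escaped in every pattern.

```python
import re

# No boundary exists between the space and "+", so \b\+ never matches.
bare = re.search(r'\b\+', 'add +')

# \bplus works because "p" is a word character preceded by "(".
word = re.search(r'\bplus', 'add (plus)')

# Allowing an optional space lets \b anchor right after the second "d".
with_space = re.search(r'\b\s?\+', 'add +')

# Skipping any run of non-plus characters also reaches the "+" inside parentheses.
first_plus = re.search(r'\b[^+]*\+', 'add (+)')

print(bare)
print(word.group(0))
print(repr(with_space.group(0)))
print(repr(first_plus.group(0)))
```

Note that the last two patterns include the text between the boundary and the plus sign in the match, so a capture group around \+ would be needed to get the plus sign alone.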
Boundaries are very conditional assertions; what they anchor depends on what they touch. See this answer for a detailed explanation, along with what else you can do to deal with it.