I want to find, in a list of strings (all capital letters), the ones that contain "NAME" with no letters directly before or after it. So I tried the regex [^A-Z]NAME[^A-Z]. But strings like "NAME" or "NAME " are not matched. I thought [^A-Z] only checks that the neighbouring character is not in that set, so having no character there at all would also be OK. Did I miss something here?
Chris
Try using a word boundary:
\bNAME\b
Here's a demo: http://regexr.com?33f1d
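The original pattern fails because a character class, even a negated one, must consume exactly one character, so it cannot match at the very start or end of a string. The sketch below (Python is used only for illustration, and the sample strings are made up) compares the original pattern with \b and with a lookaround variant. Keep in mind that \b also treats digits and underscores as word characters, which the lookaround version does not.

```python
import re

# Hypothetical sample inputs, not taken from the original question.
samples = ["NAME", "NAME ", " NAME", "A NAME B", "MYNAME", "NAMES"]

# Needs one real non-capital character on each side of NAME.
consuming = re.compile(r"[^A-Z]NAME[^A-Z]")
# Zero-width word boundaries: nothing is consumed around NAME.
boundary = re.compile(r"\bNAME\b")
# Zero-width checks that no capital letter touches NAME.
lookaround = re.compile(r"(?<![A-Z])NAME(?![A-Z])")


def matching(pattern, strings):
    """Return the strings in which the pattern is found anywhere."""
    return [s for s in strings if pattern.search(s)]


print(matching(consuming, samples))
print(matching(boundary, samples))
print(matching(lookaround, samples))
```

With these samples, only "A NAME B" satisfies the consuming pattern, while the two zero-width variants also accept "NAME", "NAME " and " NAME".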
Related
I have spent some time now learning regular expressions, but there is a problem I cannot solve properly.
Let's assume the following 'string' (html-extract):
"{'2018-05-02', '2018-01-05', r, '2018-07-01', '2017-07-02', '2016-07-31' random_text XYCCC Letters and 55565798 ]}"
My intention is to extract all values from '2018-05-02' up to (and excluding) random_text. I tried to do this with the "anything but" construct [^a] (not a):
\'[^random]*
The above does not do the job, because [^random] treats random as a set of characters rather than as a string, so the 'r' in the string cuts my extracted value short.
If there is no r in the text before the word random_text, this would work fine:
\'[^r]*
Is there any way to specify a particular string as the end of my sequence? E.g.
start: \'
repeated characters unlike string: [^{my_string}]*
Appreciate any insight :)
This regex will do the job:
'.+'(?= random)
Just replace random with the string you want to exclude at the end.
Demo & explanation
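As a rough check of the lookahead approach, here is a small Python sketch run against the sample string from the question. The variable names (text, end_marker, extracted) are illustrative only. Because .+ is greedy, the match extends to the last quote that is followed by a space and the marker.

```python
import re

text = ("{'2018-05-02', '2018-01-05', r, '2018-07-01', '2017-07-02', "
        "'2016-07-31' random_text XYCCC Letters and 55565798 ]}")

# The string that must follow the match; it is checked by the lookahead
# but is not part of the extracted value.
end_marker = "random"
pattern = re.compile(r"'.+'(?= " + re.escape(end_marker) + r")")

match = pattern.search(text)
extracted = match.group(0) if match else None
print(extracted)
```

The lone r between the dates no longer interrupts the match, since the pattern looks for the whole word rather than for individual characters.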
I want to split a text into its single words using regular expressions. The obvious solution would be to use the regex \\b, but unfortunately this one also splits words on the hyphen.
So I am looking for an expression that does exactly the same as \\b but does not split on hyphens.
Thanks for your help.
Example:
String s = "This is my text! It uses some odd words like user-generated and need therefore a special regex.";
String[] b = s.split("\\b+");
for (int i = 0; i < b.length; i++) {
    System.out.println(b[i]);
}
Output:
This
is
my
text
!
It
uses
some
odd
words
like
user
-
generated
and
need
therefore
a
special
regex
.
Expected output:
...
like
user-generated
and
....
@Matmarbon's solution is already quite close, but it does not fit 100%. It gives me
...
like
user-
generated
and
....
This should do the trick, even if lookaheads are not available:
[^\w\-]+
Also, not for you but for somebody who needs this for another purpose (e.g. inserting something), this is closer to an equivalent of the \b solutions:
([^\w\-]|$|^)+
because:
There are three different positions that qualify as word boundaries:
Before the first character in the string, if the first character is a word character.
After the last character in the string, if the last character is a word character.
Between two characters in the string, where one is a word character and the other is not a word character.
--- http://www.regular-expressions.info/wordboundaries.html
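To see how the first pattern behaves on the sentence from the question, here is a minimal Python sketch (the question itself uses Java; the variable names are illustrative). Note that splitting on [^\w\-]+ discards the punctuation and spaces, so "!" and "." do not appear as separate tokens.

```python
import re

s = ("This is my text! It uses some odd words like user-generated "
     "and need therefore a special regex.")

# Split on runs of characters that are neither word characters nor hyphens.
# The trailing "." leaves an empty string at the end, which is filtered out.
words = [w for w in re.split(r"[^\w\-]+", s) if w]
print(words)
```

The hyphenated word stays in one piece because the hyphen is excluded from the set of separator characters.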
You can use this:
(?<!-)\\b(?!-)
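A quick way to try the guarded word boundary is the Python sketch below. It assumes Python 3.7 or later, where re.split accepts patterns that match the empty string. The doubled backslash in the Java string literal becomes a single backslash in a Python raw string. Whitespace-only pieces are dropped here so that the tokens line up with the expected output in the question.

```python
import re

s = ("This is my text! It uses some odd words like user-generated "
     "and need therefore a special regex.")

# A word boundary that is neither preceded nor followed by a hyphen.
pieces = re.split(r"(?<!-)\b(?!-)", s)

# Trim surrounding spaces and drop pieces that were only whitespace.
tokens = [p.strip() for p in pieces if p.strip()]
print(tokens)
```

Unlike the character-class split, this keeps "!" and "." as tokens of their own while leaving user-generated intact.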
It is hard to find. I need to write a lexer and tokenizer for it.
I've got a problem in finding a regex which matches variable names but not string values.
The following should not be matched:
"ala ma kota"
5aalaas
This should be matched:
ala_ma_KOTA999653
l90
a
I already got something like this:
[a-zA-z]\w+
but I don't know how to exclude " chars from the beginning and end of a match.
Thanks for any reply or google links (I couldn't find it - it can be from lmgify ;)).
I interpret variable names as all word character sequences with a min length of 1 and starting with a letter. Your regexp was almost correct then:
^[A-Za-z]\w*$
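The anchored pattern can be checked against the examples from the question with a short Python sketch (names such as identifier and sloppy are illustrative). It also shows a side effect of the range A-z in the original attempt: that range covers the ASCII characters between Z and a as well, including the underscore.

```python
import re

identifier = re.compile(r"^[A-Za-z]\w*$")

should_match = ["ala_ma_KOTA999653", "l90", "a"]
should_not_match = ['"ala ma kota"', "5aalaas"]

# Keep only the candidates that the anchored pattern accepts in full.
accepted = [s for s in should_match + should_not_match if identifier.match(s)]
print(accepted)

# The range A-z (lowercase z) also admits "_" as a first character.
sloppy = re.compile(r"^[a-zA-z]\w*$")
print(bool(sloppy.match("_x")), bool(identifier.match("_x")))
```

The anchors are what reject the quoted string and 5aalaas: without them, the pattern could still match a fragment inside those inputs.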
I am trying to create a simple matcher that matches any string consisting of alphanumeric characters. I tried the following:
Ext.regModel('RegistrationData', {
fields: [
{name: 'nickname',type: 'string'},
],
validations: [
{type: 'format', name: 'nickname', matcher: /[a-zA-Z0-9]*/}
]
});
However, this does not work as expected. I did not find any documentation on what a regular expression in a matcher should look like.
Thank you for help.
I found a blog on sencha.com, where they explain the validation.
I have no idea what sencha-touch is, but it would help if you told us what you are passing to your regex, what you expect it to do, and what it actually does ("does not work as expected" is a bit vague). According to the blog it accepts "regular expression format", so for your simple check it should be pretty standard.
EDIT:
As a wild guess, maybe you want to use anchors to ensure that the name has really only letters and numbers:
/^[a-zA-Z0-9]*$/
^ is matching the start of the string
$ is matching the end of the string
Your current regex /[a-zA-Z0-9]*/ matches zero or more consecutive letters (a-z, A-Z) or digits anywhere in the string. That's why Joe#2, J/o/e and *89Joe match just as Joe and Joe24andjOe28 do: every one of them contains zero or more consecutive occurrences of those characters.
If you want your string to contain only the respective characters you have to change the regex according to stema's answer:
/^[a-zA-Z0-9]*$/
But this still has one problem. Because * means zero or more occurrences, it also matches an empty string, so the correct regex should be:
/^[a-zA-Z0-9]+$/
with + meaning one or more occurrences. This will allow nicknames containing only one lowercase or uppercase character or number, such as a, F or 6.
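The difference between the three variants can be sketched in Python. This assumes the matcher behaves like a "found anywhere in the string" test, which is what the answers above describe but which is not confirmed for Sencha here. The nickname list reuses the examples above plus an empty string.

```python
import re

nicknames = ["Joe#2", "J/o/e", "*89Joe", "Joe", "Joe24andjOe28", ""]

unanchored = re.compile(r"[a-zA-Z0-9]*")      # can match an empty run anywhere
anchored_star = re.compile(r"^[a-zA-Z0-9]*$")  # whole string, may be empty
anchored_plus = re.compile(r"^[a-zA-Z0-9]+$")  # whole string, at least one char


def accepted(pattern):
    """Return the nicknames in which the pattern is found."""
    return [n for n in nicknames if pattern.search(n)]


print(accepted(unanchored))
print(accepted(anchored_star))
print(accepted(anchored_plus))
```

The unanchored pattern accepts every entry, the anchored * version still lets the empty string through, and only the anchored + version is limited to Joe and Joe24andjOe28.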
Anyone know why this is happening:
Filename: 031_Lobby.jpg
RegExp: (\d+)_(.*)[^_e|_i]\.jpg
Replacement: \1_\2_i.jpg
That produces this:
031_Lobb_i.jpg
For some reason it's chopping the last character from the second back-reference (the "y" in "Lobby"). It doesn't do that when I remove the [^_e|_i], so I must be doing something wrong that's related to that.
Thanks!
You force it to chop off the last character with this part of your regex:
[^_e|_i]
Which translates as: Any single character except "_", "e", "|", "i".
The "y" in "Lobby" matches this criterion.
You mean "not _e" and "not _i", obviously, but that's not the way to express it. This would be right:
(\d+)_(.+)(?<!_[ei])\.jpg
Note that the dot needs to be escaped in regular expressions.
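The effect of the character class versus the negative lookbehind can be reproduced with a small Python sketch. The tool used in the question is not stated, so the replacement syntax here is Python's, and the second and third filenames are made up for the test.

```python
import re

# Pattern from the question: the class consumes one character before ".jpg".
original = re.compile(r"(\d+)_(.*)[^_e|_i]\.jpg")
# Pattern from the answer: the lookbehind checks the suffix without consuming it.
fixed = re.compile(r"(\d+)_(.+)(?<!_[ei])\.jpg")

replacement = r"\1_\2_i.jpg"

print(original.sub(replacement, "031_Lobby.jpg"))
print(fixed.sub(replacement, "031_Lobby.jpg"))
print(fixed.sub(replacement, "031_Lobby_i.jpg"))
print(fixed.sub(replacement, "031_Lobby_e.jpg"))
```

With the original pattern the "y" is swallowed by the class and never reaches the second group. With the lookbehind version the full name is kept, and files that already end in _i or _e are left unchanged.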
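It is removing the "y" because [^_e|_i] matches the y, and the .* matches everything before the y.
You're forcing it to consume one last character that is none of _, e, | or i. Making the class optional (note the last *) stops it from chopping that character, although the pattern then no longer excludes names that end in _e or _i:
(\d+)_(.*)[^_e|_i]*\.jpg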