Elasticsearch - Match hex number of fixed number of digits - regex

I have been trying to match using query_string and wildcard to exclude some values in my data.
I have values of the following type among others:
qa4689f54ad-XYXY
So the value starts with a ‘q’, followed by a 10-digit hex number, then a hyphen, and then the rest.
I tried the obvious q[a-fA-F0-9]{10}* expression (with the escape \) but it doesn’t match!
When I try the same regular expression on regex tester websites it matches perfectly.
I have gone through maybe 10 questions related to regex in Elasticsearch, but in vain.
Can someone please help? Thanks.

{10}* is not a valid construct in regular expressions.
You mean:
q[a-fA-F0-9]{10}.*
or (to make sure the hyphen is there):
q[a-fA-F0-9]{10}-.*
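or (to make sure the match occurs at the start of the string, in engines that support anchors; Elasticsearch's Lucene regexp syntax always matches against the whole term and does not support ^, so the previous form already does this there):
^q[a-fA-F0-9]{10}-.*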
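
As a rough illustration of the difference, the patterns above can be tried in Python's re module. This is only a sketch: Python's engine is not the one Elasticsearch uses, and fullmatch is used on the assumption that the whole value has to match. The shorter second test value is made up for the example.

```python
import re

value = "qa4689f54ad-XYXY"

# The original attempt puts * directly after {10}. Python rejects this as a
# stacked quantifier; other engines may react differently.
try:
    re.compile(r"q[a-fA-F0-9]{10}*")
    stacked_quantifier_ok = True
except re.error:
    stacked_quantifier_ok = False

# The corrected patterns use ".*" ("any characters") instead of a bare "*".
loose = re.compile(r"q[a-fA-F0-9]{10}.*")
strict = re.compile(r"q[a-fA-F0-9]{10}-.*")

print(stacked_quantifier_ok)
print(bool(loose.fullmatch(value)))
print(bool(strict.fullmatch(value)))
# Hypothetical value with only 8 hex digits before the hyphen.
print(bool(strict.fullmatch("qa4689f54-XYXY")))
```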

regular expression which can treat a string containing '#' as illegal input

I wrote a regular expression (https?:\/\/)+([a-x]*)?.[a-z]*.(com|io|cn|net) that can achieve:
Must start with http or https
Must end with com, cn, io, or net
Domain names can only consist of numbers, letters, and underscores
Subdomain can be empty
Valid inputs include 'http://123.cn' and 'https://www.123.cn',
but the expression also accepts 'http://ww#.123.com'.
What is wrong with my expression, and how can I reject input containing '#'?
If you use a RegEx tester online (like regex101.com) it will tell you that it's matching because the . is not escaped as \. so it will match the # character.
Try: ^(https?:\/\/)([a-z0-9_]*\.)?[a-z0-9_]*\.(com|io|cn|net)$ and you may get what you're looking for.
Note your original RegEx did not include digits or the underscore in the domain names.
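
A quick way to check the suggested pattern is to run it against the three URLs from the question. The sketch below uses Python's re module, which is an assumption since the question does not name a language; the two small "ww." patterns are made up to isolate the escaped versus unescaped dot.

```python
import re

# The pattern suggested above, copied as-is.
URL_RE = re.compile(r"^(https?:\/\/)([a-z0-9_]*\.)?[a-z0-9_]*\.(com|io|cn|net)$")

samples = ["http://123.cn", "https://www.123.cn", "http://ww#.123.com"]
results = {s: bool(URL_RE.match(s)) for s in samples}

# An unescaped dot is a wildcard and accepts '#'; an escaped dot does not.
wildcard_hits = bool(re.fullmatch(r"ww.", "ww#"))
literal_hits = bool(re.fullmatch(r"ww\.", "ww#"))

for url, ok in results.items():
    print(url, ok)
print(wildcard_hits, literal_hits)
```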

trying to find the correct regular expression

I have the following cases that should match a regular expression. I've tried several combinations and read a lot of answers, but I still have no clue how to solve it.
The rule is: find any combination of . inside a quoted string. At the moment I have the following regexp:
\"\w*((..)|(.))\w*\"
that covers most of the cases:
mmmas"A.F"asdaAA
196.34.45.."asd."#
".add"
sss"a.aa"sss
".."
"a.."
"a..a"
"..A"
but I am still having problems with this one:
"WERA.HJJ..J"
I've been testing the regexp on the http://regexr.com/ site.
I would really appreciate any help with this.
Change your regex to
\"\w*(\.+\w*)+\"
Update: escape . to match the dot and not any character
From the question, it seems that you need to find every occurrence of one or more dot (along with optional word characters) inside a pair of quotes. The following regex would do this:
\"\w*(\.+\w*)+\"
In "WERA.HJJ..J", you have some word characters followed by a dot, then another run of word characters, then two dots and more word characters. Your regex only matches one or two dots with an optional block of word characters on either side.
The dots in the regex are escaped to avoid them being matched against any character, since it is a metacharacter.
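
The regex can be checked against the sample strings from the question. The following is a minimal sketch in Python (an assumption, since the question only mentions regexr.com); the helper name and the dot-free "abc" sample are made up for the example.

```python
import re

# The suggested pattern, copied as-is.
QUOTED_DOTS = re.compile(r'\"\w*(\.+\w*)+\"')

def quoted_match(text):
    """Return the quoted substring that contains dots, or None."""
    m = QUOTED_DOTS.search(text)
    return m.group(0) if m else None

samples = [
    'mmmas"A.F"asdaAA',
    '196.34.45.."asd."#',
    '".add"',
    '".."',
    '"a..a"',
    '"WERA.HJJ..J"',
    '"abc"',  # hypothetical sample with no dot inside the quotes
]

for s in samples:
    print(s, "->", quoted_match(s))
```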

regex for expression match containing a word but not period

I am on a quick project and cannot learn regex at the moment, so I need some help. I know it's too basic. Please tell me a regex that matches expressions containing "becoming_*" but not those containing a period. For example, it
matches the following expressions:
becoming_1
becoming_2
becoming_20
But it does not match:
becoming_1.1
becoming_1.5
becoming_2.1
becoming_20.1
becoming_20.50
You could try the regex below to match the lines that have the string becoming_ followed by an integer,
^becoming_\d+$
OR
^becoming_[0-9]+$
You want a string that ends with digits and does not include a period. That's why we use \d+ for the digits and $ to require that nothing comes after them.
/becoming_\d+$/
Try a negative lookahead with a boundary
(becoming_\d+\b(?!\.))
You did not tag the question with a specific language, so I am not sure which dialect you are using.
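
Since no language was given, here is a small sketch in Python (an assumption) that runs the anchored pattern and the lookahead pattern against the examples from the question. The variable names are made up for the example.

```python
import re

ANCHORED = re.compile(r"^becoming_\d+$")
LOOKAHEAD = re.compile(r"(becoming_\d+\b(?!\.))")

should_match = ["becoming_1", "becoming_2", "becoming_20"]
should_not_match = ["becoming_1.1", "becoming_1.5", "becoming_2.1",
                    "becoming_20.1", "becoming_20.50"]

# Each entry is (anchored result, lookahead result) for one input string.
accepted = [(bool(ANCHORED.search(s)), bool(LOOKAHEAD.search(s))) for s in should_match]
rejected = [(bool(ANCHORED.search(s)), bool(LOOKAHEAD.search(s))) for s in should_not_match]

print(accepted)
print(rejected)
```

The anchored form requires the whole string to be becoming_ plus digits, while the lookahead form can also find such a token inside longer text.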

Weird behavior in a regular expression

I tried the following regular expression:
Pattern: ((.[^[0-9])+)(([0-9]{1,3}([.][0-9]{3})+)|([0-9]+))
My goal is to match any string of non-digit characters followed by a number, e.g. MG2999, dasdassa33232
I used the above regular expression.
The results are weird:
V375 (not matched)
Vv375 (matched)
Vvv375 (matched, but the first character is not included in the match)
Vvvv375 (matched)
...
I don't understand why the first character is not matched. Could you help me?
For a quick test, please try: http://regex101.com/
Thanks in advance!
--
Vu
(.[^[0-9])+ matches any character (.) followed by any character except digits and [, repeatedly. It therefore consumes characters in pairs, which is why an odd number of leading letters fails or loses the first character.
You probably want [^0-9]+ here – or, simpler, \D+.
The rest of the regular expression has similar problems, but since I don’t know the number format you want to match, I cannot correct that.
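
The pairing behaviour can be reproduced with a short Python sketch (Python is an assumption; the question used regex101.com). The FIXED pattern is only a hypothetical simplification that assumes the number is a plain run of digits, and the inner [ of the original is escaped solely to avoid Python's nested-set warning.

```python
import re

# The asker's pattern. Escaping the inner "[" does not change what it matches.
ORIGINAL = re.compile(r"((.[^\[0-9])+)(([0-9]{1,3}([.][0-9]{3})+)|([0-9]+))")
# Hypothetical fix: \D+ as suggested, then a plain run of digits (assumed format).
FIXED = re.compile(r"(\D+)([0-9]+)")

def matched_text(pattern, s):
    """Return the text the pattern matched in s, or None."""
    m = pattern.search(s)
    return m.group(0) if m else None

for s in ["V375", "Vv375", "Vvv375", "Vvvv375"]:
    print(s, matched_text(ORIGINAL, s), matched_text(FIXED, s))
```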

Regular Expression for matching exactly a single digit followed by a word in Notepad++

:Statement
Say we have the following three records and want to match only the first one, i.e. exactly one digit followed by a specific word. What regular expression can be used to do this (in Notepad++)?
2Cups
11Cups
222Cups
The expressions I tried and their problems are:
Proposal 1: \d{1}Cups
It will find the "1Cups" and "2Cups" substrings in the second and third records respectively, which is not what we want here.
Proposal 2: [^0-9]+[0-9]Cups
Same as above.
(PS: the records can be "XX 2Cups", "YY22Cups" and "XYZ 333Cups", i.e., no assumption on the position of the matchable parts)
Any suggestions?
:Reference
[1] The regex definition in Notepad++ (same as SciTE)
As mentioned in Searching for a complex Regular Expression to use with Notepad++, it is: http://www.scintilla.org/SciTERegEx.html
[2] Matching exact number of digits
Here is an example: regular expression to match exactly 5 digits.
However, we do not want to find the match-able substring in longer records here.
If the string actually has the numbered sequence (1. 2Cups 2. 11Cups), you can use the white space that follows it:
\s\d{1}Cups
If there isn't the numbered list before, but the string will be at the beginning of the line, you can anchor it there:
^\d{1}Cups
Tested in Notepad++ v6.5.1 (Unicode).
It sounds like you want to match the digit only at the start of the string or if it has a space before it, so this would work:
(^|\b)\dCups
Explanation:
(^|\b) Match the start of the string or the beginning of a word (technically, a word boundary)
\d Match a digit ({1} is redundant)
Cups Match Cups
This will work:
\b\dCups
If "Cups" must be a whole word (i.e. not matching 2Cupsizes):
\b\dCups\b
Note that \b matches even if at start or end of input.
I found one possible solution:
Using ^\d{1}Cups to match the "starting with one digit + Cups" cases, as suggested by Ken, Cottrell and Bohemian.
Using [^\d]\dCups to match other cases.
However, I haven't found a solution that uses just one regex yet.
Have a try with:
(?:^|\D)\dCups
This will match a single digit followed by Cups only if there is no digit before it.
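
To compare this with the \b-based answers, the records from the question can be run through both patterns. The sketch below uses Python's re module purely for illustration; Notepad++ has its own regex engine and may behave differently. The capturing group around \dCups and the extra record "YY2Cups" are assumptions added for the example.

```python
import re

# The pattern above, with a capturing group added to isolate the digit and word.
NO_DIGIT_BEFORE = re.compile(r"(?:^|\D)(\dCups)")
# The word-boundary variant from the earlier answer.
WORD_BOUNDARY = re.compile(r"\b\dCups")

records = ["2Cups", "11Cups", "222Cups",
           "XX 2Cups", "YY22Cups", "XYZ 333Cups",
           "YY2Cups"]  # hypothetical record: one digit directly after letters

no_digit_hits = [r for r in records if NO_DIGIT_BEFORE.search(r)]
boundary_hits = [r for r in records if WORD_BOUNDARY.search(r)]

print(no_digit_hits)
print(boundary_hits)
```

The two patterns agree on the records from the question. They differ only when the digit directly follows a letter, because there is no word boundary between a letter and a digit.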