Hi,
please help me with a regex to find phrases in a text.
My regex is not working. My assumption is that phrases begin with an uppercase letter and end with a dot, and can contain anything in between.
\b([A-Z]+[aA-zZ]*\b(.)+)
Sincerely,
You can use the following if the text inside a phrase doesn't itself contain a dot.
[A-Z][^.]*\.
Or perhaps, you could try using the following.
[A-Z].*?\.
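To see how the two patterns above behave, here is a minimal Python sketch. The sample text is made up for illustration, and Python's `re` module is only an assumption about the regex engine in use; other engines should behave similarly for these patterns.

```python
import re

# Hypothetical sample text; any sentence-like input would do.
text = "My regex is not ok. Phrases begin with uppercase. They end with a dot."

# Negated class: after the capital letter, take everything that is not a dot.
negated = re.findall(r"[A-Z][^.]*\.", text)

# Lazy quantifier: take as little as possible up to the next dot.
# Without re.DOTALL, `.` will not cross a newline, whereas [^.] will.
lazy = re.findall(r"[A-Z].*?\.", text)

print(negated)
print(lazy)
```

On this input both patterns return the same three phrases. They only diverge when a phrase spans several lines.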
Here is one variant
\b([A-Z][^.]*\.+)\b
Try this. It starts with a capital letter, ends with a dot, and allows zero or more of anything in between:
^[A-Z].*[.]$
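Note that the anchored pattern checks a whole string (or a whole line) rather than finding individual phrases inside it. A small Python sketch with made-up inputs illustrates the difference:

```python
import re

whole_line = re.compile(r"^[A-Z].*[.]$")

# A single phrase is accepted as a whole.
single = whole_line.search("One phrase only.")

# Two phrases in one string come back as ONE match, because .* is greedy
# and $ only matches at the very end.
double = whole_line.search("First phrase. Second phrase.")

# A string that does not start with a capital letter is rejected.
lowercase = whole_line.search("no capital here.")

print(single.group(0))
print(double.group(0))
print(lowercase)
```

So this variant suits validating one phrase per line (possibly with `re.MULTILINE`), while the unanchored patterns are the ones to use for extracting several phrases from running text.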
Related
Let's say I have the text a123456. I want a string of b123456 to match. So essentially, 'match if all characters are the same except for the first character'. Am I asking for the impossible with regex?
Use the dot (.) to match any character. So, a possible Regex would be:
/^.123456$/
If you want to use a zero-length assertion, you can take a lookbehind approach like this:
(?<=\w)your_value$ // your_value should be text which you want to check
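Both suggestions can be sketched in Python. The helper name `same_except_first` is hypothetical and not part of any library; it builds the "dot plus literal rest" pattern from a reference string.

```python
import re

def same_except_first(reference: str, candidate: str) -> bool:
    """Hypothetical helper: True if candidate equals reference apart from its first character."""
    # `.` stands for any single first character; the rest is matched literally.
    pattern = "." + re.escape(reference[1:])
    return re.fullmatch(pattern, candidate) is not None

print(same_except_first("a123456", "b123456"))  # first character differs
print(same_except_first("a123456", "b123457"))  # last character differs
print(same_except_first("a123456", "123456"))   # first character missing

# Lookbehind variant: the match itself is only "123456",
# but it must be preceded by one word character.
lookbehind_hit = re.search(r"(?<=\w)123456$", "b123456")
lookbehind_miss = re.search(r"(?<=\w)123456$", "123456")
```

The dot also accepts the original first character, so `a123456` itself would match too.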
I think you can figure it out on your own. This ain't tough, just needs some understanding between you and Regex. Why don't you go through the following links and try to make a regex on your own.
https://www.talentcookie.com/2015/07/regular-expressions/
https://www.talentcookie.com/2015/07/lets-practice-regular-expression/
https://www.talentcookie.com/2016/01/some-useful-regular-expression-terminologies/
Given:
<currency: word with spaces, 56>
and the regular expression:
<(?:CURRENCY):[ ]*(\w+(\s*,\s*)\d+(\s*\d+)*)>
What must I change to accept spaces in the "words with spaces"?
You're currently matching with \w, which matches only word characters (letters, digits, and underscore), so it cannot match whitespace. Also, I'm not sure whether you intend to capture the whitespace and commas instead of the number values. This captures only the words and the numbers.
<CURRENCY:\s+?(.+)\s*,\s*(\d+)(?:\s*(\d+))?>
I find regex101.com to be helpful when debugging these things.
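As a rough check of that pattern, here is a Python sketch. It assumes the case-insensitive flag is on, since the pattern spells CURRENCY in uppercase while the sample text uses lowercase.

```python
import re

sample = "<currency: word with spaces, 56>"

# re.IGNORECASE is an assumption: without it, CURRENCY would not match "currency".
pattern = re.compile(r"<CURRENCY:\s+?(.+)\s*,\s*(\d+)(?:\s*(\d+))?>", re.IGNORECASE)

match = pattern.search(sample)
groups = match.groups()
print(groups)
```

The first group holds the words, the second the number, and the optional third group stays empty (`None` in Python) because there is no second number in the sample.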
Does this help?
<(?:currency)\:\s*\w(\w|\s)+,\s*\d+(\s*|\d+)*>
This should do:
<currency: ((\w+\s*)*), (\d+)>
Another one:
<(?:currency:)\s([\w\s]*),\s\d+>
If you don't want to capture an empty string, change * to +.
Also, you don't need the non-capturing group ?:
<currency:\s([\w\s]*),\s\d+>
would do the same.
This worked for me:
<(currency):[ ]*(\w+(\s+\w+)*\s*,\s*\d+(\s*\d+)*)>
Not sure what "?:" does, so you may want:
<(?:CURRENCY):[ ]*(\w+(\s+\w+)*\s*,\s*\d+(\s*\d+)*)>
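For what it's worth, `(?:...)` groups without capturing, so it only changes how the groups are numbered. A short Python sketch comparing the two patterns above (the case-insensitive flag on the second one is an assumption, needed because the sample is lowercase):

```python
import re

sample = "<currency: word with spaces, 56>"

# Capturing group around "currency": it becomes group 1.
capturing = re.search(
    r"<(currency):[ ]*(\w+(\s+\w+)*\s*,\s*\d+(\s*\d+)*)>", sample
)

# Non-capturing group: "currency" is matched but not numbered,
# so the words-and-number part moves up to group 1.
non_capturing = re.search(
    r"<(?:CURRENCY):[ ]*(\w+(\s+\w+)*\s*,\s*\d+(\s*\d+)*)>", sample, re.IGNORECASE
)

print(capturing.group(1), "|", capturing.group(2))
print(non_capturing.group(1))
```

Both patterns match the same text; only the group indexes differ.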
I am trying to match the following two words, but for some reason, my regexp doesn't work.
Here is what I'm trying to match: OCXXXXXX GXXXXXXX
where X is any digit or letter.
Here is my regexp:
OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+$
If I remove the dollar sign, it works, but I want the regexp to match exactly those two words and fail if there are more than those two words. Because of that, I want to use the $. Any ideas why this does not work?
Maybe you have a space at the end. Try this:
OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+\b
or
OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+\s*
That's weird, it is now working. I was using this website:
http://www.regexr.com/
I ended up using
^OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+$
After I refreshed the website, it started working. Sorry about this post.
Thanks anyway M42
^\s*OC[a-zA-Z0-9]+\s+G[a-zA-Z0-9]+\s*$ should work.
It's anchored at the beginning and end of the string (^ and $), and allows for optional
whitespace at the beginning or end and required whitespace between the words.
The quantifiers on the whitespace are open-ended.
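A quick Python sketch comparing the fully anchored pattern with the original one from the question. The sample strings are invented; the second has a trailing space, which is the situation suspected above.

```python
import re

strict = re.compile(r"^\s*OC[a-zA-Z0-9]+\s+G[a-zA-Z0-9]+\s*$")
original = re.compile(r"OC[a-zA-Z0-9]+\sG[a-zA-Z0-9]+$")

samples = [
    "OC12ab34 G5678cde",        # exactly two words
    "OC12ab34 G5678cde ",       # trailing space
    "OC12ab34 G5678cde extra",  # a third word
]

strict_results = [bool(strict.search(s)) for s in samples]
original_results = [bool(original.search(s)) for s in samples]

print(strict_results)
print(original_results)
```

The anchored pattern tolerates the trailing space but still rejects the third word, while the original pattern fails as soon as anything, even a space, follows the second word.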
I have a problem here. I have the following string:
#Novriiiiii yauda busana muslim #nencor haha. wa'alaikumsalam noperi☺
Then I use this regex pattern to select all the words in the string:
\w+
However, I need to select everything except the words prefixed with #, like #Novriiiiii or #nencor. In other words, the #word ones have to be excluded.
How do I do that?
P.S. I am using RegexPal to test the regex, and I want to apply the regex pattern to the Yahoo Pipes regex. Thank you.
You can use a negative lookbehind so that if a word is preceded by # it is excluded. You also need a word boundary before the word or else the lookbehind will only affect the first character.
(?<!#)\b\w+
http://rubular.com/r/ONEl70Am5Q
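Here is a small Python sketch of the lookbehind pattern, using a shortened version of the sample string. Python is used only for illustration; whether the target tool supports lookbehind is a separate question.

```python
import re

text = "#Novriiiiii yauda busana muslim #nencor haha. wa'alaikumsalam noperi"

# Word boundary plus negative lookbehind: a word may not start right after '#'.
words = re.findall(r"(?<!#)\b\w+", text)
print(words)

# Without \b the lookbehind only protects the first letter,
# so the rest of the tagged word still matches.
without_boundary = re.findall(r"(?<!#)\w+", "#nencor")
print(without_boundary)
```

Note that the apostrophe splits `wa'alaikumsalam` into two matches, because `'` is not a word character.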
Does this suit your needs?
http://rubular.com/r/uuXvNrUiGJ
[^#\w+]\w+
This would solve your problem:
[^#\w+][\w.]+
Check this link: http://regexr.com?34tq7
If you cannot use a negative lookbehind as other answers have already suggested, here's a workaround.
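\w already doesn't match the # character, so you'd want something like this:
[^#]\w+
But this will (a) not work at the beginning of the string, (b) include the character before the word in the match, and (c) still match inside a #word, because [^#] can consume the first letter after the #. To fix (a) and (c), allow the start of the string and exclude word characters from the class as well:
(^|[^#\w])\w+
To fix (b), we parenthesize the part we want:
(^|[^#\w])(\w+)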
Then use $2 or \2 (depending on regex dialect) to refer to the matched word.
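A Python sketch of the group-based workaround. It uses a variation of the pattern whose character class excludes word characters as well as #, so that a match cannot begin in the middle of a #word. The sample text is a shortened version of the one in the question.

```python
import re

text = "#Novriiiiii yauda busana muslim #nencor haha"

# Group 1: start of string, or one character that is neither '#' nor a word character.
# Group 2: the word itself.
pattern = re.compile(r"(^|[^#\w])(\w+)")

words = [m.group(2) for m in pattern.finditer(text)]
print(words)

# A word at the very start of the string is picked up through the ^ alternative.
leading = [m.group(2) for m in pattern.finditer("hello #tag world")]
print(leading)
```

In Python the second group is read with `m.group(2)`, which corresponds to $2 or \2 in other dialects.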
Another option is to include the # symbol in the word:
[\w#]+
And then add another step in your Pipe to filter out all words that start with a #.
One way to do that is to remove the words you don't want. Example:
find: #\w+
replace: empty string
This gives you the text without the #abcdef words.
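The find-and-replace approach translates directly to a substitution call. A minimal Python sketch, with a shortened sample string:

```python
import re

text = "#Novriiiiii yauda busana muslim #nencor haha"

# Step 1: replace every #word with the empty string.
without_tags = re.sub(r"#\w+", "", text)

# Step 2: select the remaining words as before.
remaining = re.findall(r"\w+", without_tags)

print(repr(without_tags))
print(remaining)
```

The removal leaves the surrounding spaces in place, which does not matter if the next step only selects words.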
I'm very new at regex, and to be completely honest it confounds me. I need to grab the string after a certain character is reached in said string. I figured the easiest way to do this would be using regex, however like I said I'm very new to it. Can anyone help me with this or point me in the right direction?
For instance:
I need to check the string "23444:thisstring" and save "thisstring" to a new string.
If this is your string:
I'm very new at regex, and to be completely honest it confounds me
and you want to grab everything after the first "c", then this regular expression will work:
/c(.*)/s
It will return this match in the first matched group:
"ompletely honest it confounds me"
You can try it in a regex tester.
Explanation:
The c is the character you are looking for
.* (in combination with /s) matches everything left
(.*) captures what .* matched, making it available in $1 and returned in list context.
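The same idea in Python, as a sketch: the `/s` modifier corresponds to the `re.DOTALL` flag, and `$1` corresponds to `group(1)`. The two-line string is a made-up input to show what the flag changes.

```python
import re

sentence = "I'm very new at regex, and to be completely honest it confounds me"

# Everything after the first "c" lands in group 1.
after_c = re.search(r"c(.*)", sentence, re.DOTALL).group(1)
print(after_c)

# Effect of the flag: without re.DOTALL, `.` stops at a newline.
two_lines = "abc\ndef"
single_line_mode = re.search(r"c(.*)", two_lines).group(1)
dotall_mode = re.search(r"c(.*)", two_lines, re.DOTALL).group(1)
print(repr(single_line_mode), repr(dotall_mode))
```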
Regex for deleting characters before a certain character!
You can use lookahead like this
.*(?=x)
where x is a particular character, word, or string. (Characters like ., $, ^, *, and + have special meaning in regex, so don't forget to escape them when using them within x.)
EDIT
for your sample string it would be
.*(?=thisstring)
.* matches zero or more characters up to thisstring
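A Python sketch of the lookahead approach. The helper name `text_before` is hypothetical; it uses `re.escape` to handle the special characters mentioned above.

```python
import re
from typing import Optional

def text_before(marker: str, text: str) -> Optional[str]:
    """Hypothetical helper: return everything before the last occurrence of marker."""
    # re.escape neutralizes characters such as . $ ^ * + inside the marker.
    match = re.match(".*(?=" + re.escape(marker) + ")", text, re.DOTALL)
    return match.group(0) if match else None

print(text_before("thisstring", "23444:thisstring"))
print(text_before(".", "a.b.c"))
print(text_before("x", "abc"))
```

Because `.*` is greedy, the match runs up to the last occurrence of the marker; a lazy `.*?` would stop at the first one instead.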
Here is a one-line solution for matching everything after "before"
print $1."\n" if "beforeafter" =~ m/before(.*)/;
Edit:
While using lookbehind is possible, it's not required. Grouping provides an easier solution.
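For comparison, here are both variants in Python rather than Perl, as a rough sketch on the same "beforeafter" input:

```python
import re

# Grouping: match the prefix, capture what follows it.
grouped = re.search(r"before(.*)", "beforeafter").group(1)

# Lookbehind: the prefix is asserted but not part of the match.
# Python requires the lookbehind to be fixed-width, which "before" is.
lookbehind = re.search(r"(?<=before).*", "beforeafter").group(0)

print(grouped, lookbehind)
```

Both give the same result here. The grouping form also works in engines that lack lookbehind support.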
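To get the string after : in your example, you can use [^:][^:]*:\(.*\) (POSIX basic syntax, as used by sed, where the group is written \( \)). Notice that you should have at least one [^:] followed by any number of [^:]s followed by an actual :, the character you are searching for. The group then captures everything after it.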
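In Python syntax the group is written with plain parentheses, and `[^:][^:]*` can be shortened to `[^:]+`. A minimal sketch, with `str.partition` shown as a non-regex alternative for this simple case:

```python
import re

sample = "23444:thisstring"

# At least one non-colon character, then the colon, then capture the rest.
match = re.fullmatch(r"[^:]+:(.*)", sample)
after_colon = match.group(1)

# The pattern requires something before the colon, so this does not match.
no_prefix = re.fullmatch(r"[^:]+:(.*)", ":thisstring")

# Non-regex alternative: split once at the first colon.
head, sep, tail = sample.partition(":")

print(after_colon, tail)
```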