I'm trying to match a single string out of an email using regex. The email pattern looks like:
name.name.someid#mail.domain.com
And I would like to grab the 'someid' section. Meaning I need to match everything before the '#' and after the last period.
I can match everything before the '#' with (^[^#]+), but I can't combine it with a pattern that only starts after the last period (I can only get it to match after the first period).
Any pointers would be great, thanks!
Use a positive lookahead:
/[^.]+(?=#)/
Here's a demo: http://regex101.com/r/sW7sR3
/\.([^.#]+)#/
Without using lookarounds, this matches a run of characters that are neither # nor ., preceded by a . and followed by a #. The id ends up in capture group 1.
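To compare the two suggestions side by side, here is a minimal Python sketch run against the sample address from the question. The function names are made up for illustration; only the two patterns come from the answers above.

```python
import re

ADDRESS = "name.name.someid#mail.domain.com"

def id_with_lookahead(address):
    # [^.]+ grabs a run of non-period characters; (?=#) requires a '#'
    # immediately after that run without consuming it.
    match = re.search(r"[^.]+(?=#)", address)
    return match.group(0) if match else None

def id_with_group(address):
    # No lookaround: match '.', capture characters that are neither '.'
    # nor '#', then match the '#'. The id is capture group 1.
    match = re.search(r"\.([^.#]+)#", address)
    return match.group(1) if match else None

print(id_with_lookahead(ADDRESS))
print(id_with_group(ADDRESS))
```

One difference worth knowing: the second pattern needs a period somewhere before the id, so it finds nothing in an address whose local part has no period at all.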
I got a small problem with my regex.
I want to parse a plist file to get a Unix timestamp (10 digits) plus everything after the timestamp up to a certain pattern. My current pattern looks like this:
,\s*(\d{10}),\s*'(?=.[',])
I want to match the timestamp and everything between the timestamp and the pattern ',.
This is a snippet of the string from the plist:
'$class': UID(23)}, 1572871204, 'I need this one', {'dictionary': UID(34)
I want to get:
1572871204, 'I need this one'
It would be ideal if I get the timestamp as a submatch and the string as another submatch.
Thank you very much!
Before the positive lookahead, you could add another capturing group that matches one or more characters other than a comma or ', using a negated character class: ([^,']+).
But as you are matching the comma before as well, you can omit the lookahead and match the comma afterwards instead.
For example
,\s*(\d{10}),\s*'([^,']+)[',]
Regex demo
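A minimal Python sketch of that pattern, applied to the snippet from the question, might look like this. The helper name extract_entry is hypothetical, and the sketch assumes the quoted string contains no commas or apostrophes of its own, since the negated character class stops at the first one.

```python
import re

SNIPPET = "'$class': UID(23)}, 1572871204, 'I need this one', {'dictionary': UID(34)"

# Group 1: the 10-digit timestamp. Group 2: the quoted text after it.
PATTERN = re.compile(r",\s*(\d{10}),\s*'([^,']+)[',]")

def extract_entry(text):
    # Return (timestamp, string) for the first match, or None if nothing matches.
    match = PATTERN.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)

print(extract_entry(SNIPPET))
```

Both values come back as separate groups, which is what the question asks for with its "submatch" wording.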
Any ideas on how to remove periods from a large text document using a regex in a text editor, so that it behaves as in the following example:
J. don't match
F.C. don't match
word. match
Word. match
WORD. match
The regex below matches either two or more word characters or a single non-capital character, followed by a . at the end of the line:
((\w{2,})|([^A-Z]))\.$
You can try this too,
(?<!(?<=^|[^A-Z])[A-Z])\.
Demo
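For anyone trying the lookbehind approach in Python: the re module rejects the inner lookbehind (?<=^|[^A-Z]) because its alternatives have different widths. The sketch below swaps in (?<![A-Z]), which should behave the same here, since "not preceded by a capital" covers both "at the start" and "preceded by a non-capital". That substitution is an assumption of this sketch, not part of the answer, and strip_periods is a made-up name.

```python
import re

# A period that is NOT preceded by a lone capital letter, where "lone"
# means the capital itself is not preceded by another capital.
PERIOD_NOT_AFTER_INITIAL = re.compile(r"(?<!(?<![A-Z])[A-Z])\.")

def strip_periods(text):
    # Delete every period the pattern matches; initials keep theirs.
    return PERIOD_NOT_AFTER_INITIAL.sub("", text)

for sample in ["J.", "F.C.", "word.", "Word.", "WORD."]:
    print(sample, "->", strip_periods(sample))
```

Engines that allow variable-width lookbehind can use the pattern from the answer unchanged.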
You can try something like this: \w{2,}?\.
You can go to Regex101 and try it for yourself with more test strings to get the one you want. If you want to actually exclude the periods you can use a capturing group like so: (\w{2,}?)\.
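As a rough illustration of the capturing-group idea, here is a small Python sketch that replaces each match with group 1, which drops the period and keeps the word. The function name is hypothetical. Note that \w also matches digits and underscores, so a number such as 12.5 would lose its period too.

```python
import re

def strip_word_periods(text):
    # (\w{2,}?) captures two or more word characters; the period after
    # them is matched but not captured, so writing back group 1 removes
    # only the period.
    return re.sub(r"(\w{2,}?)\.", r"\1", text)

for sample in ["J. Smith", "F.C.", "word.", "Word.", "WORD."]:
    print(sample, "->", strip_word_periods(sample))
```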
I'm trying to do a single javascript regex that matches email addresses that start with lcp_ but ignore any matches that also contain the word auto at any position.
I've tried a few things with no luck
/^lcp[._-](?!auto)/gi
The goal is the following:
lcp_land#blah.com - match
lcp_land_auto#blah.com - no match
Thanks
You can use a "tempered greedy token", which basically means you check a negative lookahead before each repetition of the sub-pattern, so as to exclude the illegal string at any position. Something like this, for example:
\blcp_(?:(?!auto)\S)+(?=\s)
https://regex101.com/r/I0xisv/2
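Here is a minimal Python sketch of the tempered greedy token. One deliberate change, which is an assumption of this sketch rather than part of the answer: the trailing (?=\s) is replaced with (?!\S), so an address at the very end of the input (with no whitespace after it) can still match. The function name is made up.

```python
import re

# (?:(?!auto)\S)+ consumes one non-space character at a time, but only
# after checking that "auto" does not start at that position.
LCP_WITHOUT_AUTO = re.compile(r"\blcp_(?:(?!auto)\S)+(?!\S)")

def find_lcp_addresses(text):
    # Return every lcp_ address that does not contain "auto" anywhere.
    return LCP_WITHOUT_AUTO.findall(text)

print(find_lcp_addresses("lcp_land#blah.com\nlcp_land_auto#blah.com"))
```

Because the check runs at every position, "auto" is rejected whether it appears right after lcp_, in the middle, or in the domain part.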
I'm trying to fix a regex I created.
I have a URL like this:
http://www.demo.it/prodotti/822/Panasonic-TXP46G20E.html
and I have to match the product ID (822).
I wrote this regex:
(?<=prodotti\/).*(?<=\/)
and the result is "822/"
What I want to match is always a group of digits between two slashes.
You're almost there!
Simply use:
(?<=prodotti\/).*?(?=\/)
instead of:
(?<=prodotti\/).*(?<=\/)
And you're good ;)
See it working here on regex101.
I've actually just changed two things:
replaced that lookbehind of yours ((?<=\/)) with the corresponding lookahead, so it asserts that a / can be matched AFTER the last character consumed by the .* part.
changed the greediness of your matching pattern, by using .*? instead of .*. Without that change, for a URL with several / after prodotti/, the match would not stop at the first one.
i.e., given the input string: http://www.demo.it/prodotti/822/Panasonic/TXP46G20E.html, it would have matched 822/Panasonic.
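The difference between the lazy and greedy versions can be checked with a short Python sketch. The names LAZY, GREEDY and product_id are made up for illustration, and the slashes are left unescaped because Python patterns have no / delimiters.

```python
import re

LAZY = re.compile(r"(?<=prodotti/).*?(?=/)")    # stops at the first following '/'
GREEDY = re.compile(r"(?<=prodotti/).*(?=/)")   # runs on to the last '/'

def product_id(url, pattern=LAZY):
    # Return the text matched right after "prodotti/", or None.
    match = pattern.search(url)
    return match.group(0) if match else None

single = "http://www.demo.it/prodotti/822/Panasonic-TXP46G20E.html"
nested = "http://www.demo.it/prodotti/822/Panasonic/TXP46G20E.html"

print(product_id(single))
print(product_id(nested))
print(product_id(nested, GREEDY))
```

If the id is known to be numeric, \d+ in place of .*? would be a stricter alternative, though that goes beyond what the answer proposes.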
I have a simple regular expression looking for twitter style tags in a string like so:
$reg_exUrl = "/#([A-Za-z0-9_]{1,15})/";
This works great for matching words after an # sign.
One thing it does though which I don't want it to do is match full stops.
So it should match
"#foo"
but should not match
"#foo."
I tried amending the expression to disallow full stops like so:
$reg_exUrl = "/#([A-Za-z0-9_]{1,15})[^\.]/";
This almost works, except it will not match if it's at the end of the string. EG:
It WILL match this "#foo more text here"
but won't match this "#foo"
Could anyone tell me where I'm going wrong?
Thanks
First of all your original expression can be written like the following:
/#\w{1,15}/
because \w is equivalent to [A-Za-z0-9_].
Secondly your expression doesn't match names with . so you probably meant that you don't want to match names ending with a dot and this can be done with the following:
/#\w{1,15}(?![^\.]*\.)/
Or, if you want to match a name of any length as long as it does not end with a dot:
/#\w+(?![^\.]*\.)/
Oh ya, I forgot one thing, your problem was caused by the absence of any anchor characters such as the start of line ^ and end of line $, so you should use them if you want to match a string that contains only a twitter name which you wish to validate.
Summary: If you want to match names anywhere in the document don't use anchors, and if you want to know whether a given string is a valid name use the anchors.
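The summary above can be sketched in Python roughly as follows. The function names are hypothetical, and re.ASCII is passed because Python's \w is Unicode-aware by default, whereas the answer treats it as [A-Za-z0-9_].

```python
import re

def find_tags(text):
    # Unanchored: scan the whole text and return every tag found.
    return re.findall(r"#\w{1,15}", text, flags=re.ASCII)

def is_valid_tag(candidate):
    # Anchored: ^ and $ make the pattern cover the entire string
    # (Python's $ also tolerates a single trailing newline).
    return re.search(r"^#\w{1,15}$", candidate, flags=re.ASCII) is not None

print(find_tags("hi #foo and #bar_1!"))
print(is_valid_tag("#foo"), is_valid_tag("#foo."))
```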
It's not working if it's at the end of the string because it's expecting [^\.] after it.
What you are wanting, you can do with a negative lookahead to make sure there is no dot afterwards, like this:
/#([A-Za-z0-9_]{1,15})(?![^\.]*\.)/
Test it here
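A quick Python sketch of the negative lookahead, using the strings from the question (tag_name is a made-up helper). One thing it makes visible: the lookahead rejects the tag if a dot appears anywhere later in the string, not only directly after the tag.

```python
import re

# (?![^.]*\.) fails when any '.' can be found after the tag.
TAG_NO_DOT_AHEAD = re.compile(r"#([A-Za-z0-9_]{1,15})(?![^.]*\.)")

def tag_name(text):
    # Return the captured tag name, or None when the pattern does not match.
    match = TAG_NO_DOT_AHEAD.search(text)
    return match.group(1) if match else None

for sample in ["#foo", "#foo more text here", "#foo.", "#foo is nice. really"]:
    print(repr(sample), "->", tag_name(sample))
```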
You could also do it this way:
/#([A-Za-z0-9_]{1,15})([^\.]*)$/
Test it here
This one allows for optional characters other than a dot, and then it has to be the end of the string.
A $ matches the end of the string, and for future reference, a ^ matches the beginning:
$reg_exUrl = "/#([A-Za-z0-9_]{1,15})$/";
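To see how the two end-anchored patterns differ, here is a small Python sketch. The constant and function names are made up, and the PHP delimiters are dropped since Python patterns do not use them.

```python
import re

# Tag, then any run of non-dot characters, then end of string.
TAG_THEN_NO_DOTS = re.compile(r"#([A-Za-z0-9_]{1,15})([^.]*)$")
# Tag must be the very last thing in the string.
TAG_AT_END = re.compile(r"#([A-Za-z0-9_]{1,15})$")

def match_groups(pattern, text):
    # Return the tuple of captured groups, or None when there is no match.
    match = pattern.search(text)
    return match.groups() if match else None

print(match_groups(TAG_THEN_NO_DOTS, "#foo more text here"))
print(match_groups(TAG_THEN_NO_DOTS, "#foo."))
print(match_groups(TAG_AT_END, "#foo"))
print(match_groups(TAG_AT_END, "#foo more text here"))
```

The first pattern accepts trailing text as long as it contains no dot; the second only accepts a tag that ends the string.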