Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 13 days ago.
Improve this question
I need to extract all email addresses from the To, From and CC fields (separately) of a forwarded email body.
Is this possible with regex?
I am new to regex so don't understand it completely. Can someone help me build this logic?
REGEX LOGIC
You could use [a-zA-Z0-9-_.]+#[a-zA-Z0-9-_.]+ (from "regex extract email from strings") and then add an extra part to it that lets you identify CC / To / From separately.
The pattern is 1 or more occurrences of:
a-z: any lowercase letter
A-Z: any uppercase letter
0-9: any digit
-_.: a hyphen, an underscore or a dot
So something like this could work: CC:.*[a-zA-Z0-9-_.]+#[a-zA-Z0-9-_.]+. Here I added CC:.* at the front, so the regex only matches a line that has CC: in it.
.* just means "0 or more of any character"
It's broken down into two parts:
. - a "dot" indicates any character
* - means "0 or more instances of the preceding regex token"
From: What does .* do in regex?
EXAMPLE USE
CC:.*[a-zA-Z0-9-_.]+#[a-zA-Z0-9-_.]+ will grab the line that has CC: at the front of it.
You would then run a second regex, just [a-zA-Z0-9-_.]+#[a-zA-Z0-9-_.]+ without the CC: identifier, to extract each email from that line separately.
Then repeat this for the other two lines you want to capture (To and From).
The regex used is this one: https://regex101.com/r/KIbf1T/1
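The two-step approach above could be sketched in Python roughly as follows. This is only an illustration: the sample message, the function name and the `^` anchor are assumptions, the `#` separator is kept exactly as the addresses are written in this post (swap in `@` for real mail), and the hyphen is moved to the end of the character class so it is unambiguously literal.

```python
import re

# '#' is the separator as written in this post; use '@' for real addresses.
ADDRESS = r"[a-zA-Z0-9_.-]+#[a-zA-Z0-9_.-]+"


def extract_header_addresses(body, header):
    """Step 1: grab the header line. Step 2: pull each address out of it."""
    # "<header>:.*<address>" matches from the header up to the last address on that line.
    line = re.search(
        rf"^{re.escape(header)}:.*{ADDRESS}",
        body,
        flags=re.IGNORECASE | re.MULTILINE,
    )
    if line is None:
        return []
    return re.findall(ADDRESS, line.group(0))


# Hypothetical forwarded-mail body, made up for this example.
sample = (
    "From: Alice <alice#example.com>\n"
    "To: bob#example.com, carol#example.org\n"
    "CC: dave#example.net\n"
    "Body text mentioning eve#example.com\n"
)

addresses = {h: extract_header_addresses(sample, h) for h in ("From", "To", "CC")}
```

Addresses that appear only in the body text are ignored because no header prefix precedes them on their line.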
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
My User data can come in any of the following 3 ways -
user="dc\AAA", user="BBB", user=CCCC,
I can extract the last two easily, but the problem comes when the user data has an additional "dc\" prefix.
I am trying to remove that prefix and format all user data with a single regex, as below, but I am unable to do so:
user=AAA user=BBB user=CCC
Can someone please help.
This regex should do the work: (?:.*\\)?(.*).
Let's split this regex into parts:
(?: ) - A non-capturing group
.*\\ - Any characters, any number of times, followed by a backslash
? (after the parentheses) indicates that the group may occur once or not at all
(.*) - Any characters, captured in group 1
Overall - Captures the data after the backslash if one exists, otherwise the whole value
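As a rough sketch of how that pattern might be applied in Python: the pattern itself is the one from the answer, while the helper names and the extra `user=...` extraction regex are assumptions added only to make the example runnable on the sample line.

```python
import re

# Pattern from the answer: optional "anything up to the last backslash", then the rest.
PREFIX = re.compile(r"(?:.*\\)?(.*)")


def strip_domain(value):
    """Return the part of `value` after the last backslash, or all of it."""
    return PREFIX.fullmatch(value).group(1)


def normalize(line):
    """Hypothetical helper: pull each user value out of the line and strip its prefix."""
    # Assumed extraction step: value may or may not be wrapped in double quotes.
    values = re.findall(r'user="?([^",\s]+)"?', line)
    return " ".join(f"user={strip_domain(v)}" for v in values)


sample_line = r'user="dc\AAA", user="BBB", user=CCCC,'
normalized = normalize(sample_line)
```

Because `.*` is greedy, a value with several backslashes keeps only the part after the last one.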
I suggest using this amazing website for trying regex
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I am using LameXP to convert and encode audio files. These files are grouped poorly, but contain info in their filenames that could be used for this. Examples of files are as follows.
Genji-00000005818F.0B2-He'll talk.ogg
Tracer-00000005818C.0B2-Do you think Maximilien will talk_.ogg
Tracer-00000005818E.0B2-What does that mean_.ogg
Winston-00000005818D.0B2-He just deals with the money.ogg
LameXP offers a renaming tool that utilizes RegEx for find and replace. I would like to move the file ID (0000000XXXXX) to the beginning before the character name. What expression would I use to isolate the data ID and move it to the front?
Ideally, files would end up like this:
00000005818F_Genji-He'll talk.ogg
You need to provide more information about how the file names are formatted.
According to the examples you provided, this should do the trick:
Regex
^(.+?)-([0-9A-F]+)\.[0-9A-F]+-
^ Start of the string
(.+?) Any characters, as few as possible; this captures the character name in group 1
- A dash
([0-9A-F]+) One or more hexadecimal characters, captured in group 2
\. A dot
[0-9A-F]+ Another run of hexadecimal characters; this matches 0B2
- A dash
Substitution
\2_\1-
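To see the find-and-replace in action outside LameXP, here is a small Python sketch using the same pattern and substitution. The function name is made up for the example, and LameXP's own regex dialect may differ in details from Python's `re` module.

```python
import re

# Same pattern as above: name, dash, hex ID, dot, hex suffix, dash.
PATTERN = re.compile(r"^(.+?)-([0-9A-F]+)\.[0-9A-F]+-")


def move_id_to_front(filename):
    """Rewrite 'Name-ID.SUFFIX-Title.ogg' as 'ID_Name-Title.ogg'."""
    return PATTERN.sub(r"\2_\1-", filename)


examples = [
    "Genji-00000005818F.0B2-He'll talk.ogg",
    "Tracer-00000005818C.0B2-Do you think Maximilien will talk_.ogg",
    "Winston-00000005818D.0B2-He just deals with the money.ogg",
]
renamed = [move_id_to_front(name) for name in examples]
```

File names that do not follow the `Name-ID.SUFFIX-` shape are returned unchanged, since the pattern simply does not match them.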
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have the following string: test#gmail.com, test#gmail.com <test#gmail.com>, Test Gmail <test#gmail.com>, Test, Gmail <test#gmail.com>.
I would like to use a regex to obtain the following result in an array :
test#gmail.com,
test#gmail.com <test#gmail.com>,
Test Gmail <test#gmail.com>,
Test, Gmail <test#gmail.com>
Unfortunately, the comma cannot be used as the separator because it can also appear in the text preceding the email (e.g. Test, Gmail <test#gmail.com>).
((?>(?:\w+#\w+\.\w+)|(?:[^<>\n]+))(?> ?<(?:\w+#\w+\.\w+)>)?)(?:, )?
This will match your sample strings as you have written them.
(?>(?:\w+#\w+\.\w+)|(?:[^<>\n]+)) Match either an email, or some string of characters that does not have <, >, or \n in it, checking first for the email.
(?> ?<(?:\w+#\w+\.\w+)>)? Optionally match another email (with an optional space in front of it).
(?:, )? Optionally match a comma and a space.
Note that this does nothing to validate whether the captured string is a correctly formatted email; it only collects strings that look like one, and even then only for the specific input format shown in your example.
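A Python sketch of the same idea, run against the sample string from the question, might look like this. One deliberate change is an assumption on my part: the atomic groups `(?>...)` are written as plain non-capturing groups `(?:...)` so the pattern also runs on Python versions before 3.11; for this input the two forms match the same text.

```python
import re

# '#' is the separator as written in this post; use '@' for real addresses.
EMAIL = r"\w+#\w+\.\w+"

# Either a bare email or a display name, optionally followed by "<email>",
# then an optional ", " separator that is left out of the captured group.
ENTRY = re.compile(
    rf"((?:(?:{EMAIL})|(?:[^<>\n]+))(?: ?<(?:{EMAIL})>)?)(?:, )?"
)

sample = (
    "test#gmail.com, test#gmail.com <test#gmail.com>, "
    "Test Gmail <test#gmail.com>, Test, Gmail <test#gmail.com>"
)

# findall returns the contents of capture group 1 for each match.
entries = ENTRY.findall(sample)
```

The comma inside "Test, Gmail" survives because the display-name branch consumes everything up to the next `<`.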
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I want a regex that removes everything after the underscore in all anchor tags, e.g.
input: Text
Output Text
You should generally avoid parsing HTML with regex, but since anchor tags are not nested, a regex is good enough for a quick fix here. Use this regex to capture the data in group 1 and group 2,
(<a\s+[^>]*?href=["'][^']*?)_.*?(["'])
and replace it with \1\2 (or $1$2 as per the language)
You haven't mentioned how the data should be replaced when the href attribute contains multiple underscores. For now, this removes everything from the first underscore onward; to cut from the last underscore instead, make the quantifier before the underscore greedy.
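Here is a minimal Python sketch of that replacement. The anchor tags in the sample are invented for illustration, since the original input did not survive in the question, and the function name is hypothetical. Python uses `\1\2` for the group references.

```python
import re

# Group 1: the tag up to (not including) the first underscore inside href.
# Group 2: the closing quote of the href value.
HREF_UNDERSCORE = re.compile(r"""(<a\s+[^>]*?href=["'][^']*?)_.*?(["'])""")


def trim_href_after_underscore(html):
    """Drop everything from the first underscore to the end of each href value."""
    return HREF_UNDERSCORE.sub(r"\1\2", html)


# Hypothetical input; the real markup from the question is not known.
sample_html = '<a href="page_123.html">Text</a> <a href="plain.html">Other</a>'
trimmed = trim_href_after_underscore(sample_html)
```

Anchors whose href has no underscore are left untouched, though a regex like this can still misfire on unusual markup, so a real HTML parser is the safer choice for anything beyond a quick one-off.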
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I need a regex that will check that an email looks like a#b.cc:
email must have # and . (must have both). I refer to the whole string that must contain # and at least one dot.
1st word of email must have 1+ char
the domain name between # and . must be 1+ char
TLD must be 2+ char
I made a regex like .+#.+\. but I know it's not right. I am bad at regex as I use it so rarely.
Can anyone help me?
It's not clear if you are matching an email in the middle of a paragraph of text, or matching an already extracted string. I am assuming the latter, and anchoring the match to start and end of line...
/^.+#.+\.[^.]{2,}$/
p.s. using regex to validate emails is complex: http://www.regular-expressions.info/email.html
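For a quick check of how that pattern behaves against the rules listed in the question, here is a small Python sketch. The function name and the test strings are made up for the example; the `#` separator follows the way addresses are written in this post.

```python
import re

# 1+ chars, '#', 1+ chars, a dot, then a final label of 2+ chars with no dot.
EMAIL_SHAPE = re.compile(r"^.+#.+\.[^.]{2,}$")


def looks_like_email(text):
    """Loose shape check only; this is not real email validation."""
    return EMAIL_SHAPE.match(text) is not None


checks = {
    "a#b.cc": looks_like_email("a#b.cc"),   # minimal valid shape
    "a#b.c": looks_like_email("a#b.c"),     # TLD too short
    "ab.cc": looks_like_email("ab.cc"),     # no separator
    "#b.cc": looks_like_email("#b.cc"),     # nothing before the separator
    "a#.cc": looks_like_email("a#.cc"),     # empty domain name
}
```

Since `.` matches almost any character, strings containing spaces or several separators also pass, which is consistent with the linked warning that regex-based email validation is harder than it looks.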