How to put .com at the end of email addresses by regex? - regex

Example
I received an email list from my friends, but some people typed their address in full (xxx#example.com) and some typed it without the .com (xxx#xxx). I want to bring them all into the same format. How can I do that when editing the file in vi?
My emaillist.txt contains:
foo#gmail
bar#hotmail.com
bas#gmail
qux#abc.com
mike#abc
john#email
My try:
I tried a simple regex like this to catch the pattern xxx#xxx:
:%s/\(\w*#\w*\)/\0.com/g
or
:%s/\(\w*#\w*[^.com]\)/\0.com/g
But the problem is that this regex also matches xxx#example.com.
After I enter the command above, the result looks like this:
foo#gmail.com
bar#hotmail.com.com
bas#gmail.com
qux#abc.com.com
mike#abc.com
john#email.com
What I expect after the substitution is this:
foo#gmail.com
bar#hotmail.com
bas#gmail.com
qux#abc.com
mike#abc.com
john#email.com
How to use regex in this situation?

You can use this command:
%s/^.*\(\.com\)\@<!$/\0\.com/g
The search pattern matches each line that does not end with .com (I just copied the recipe from Vim: Find any line NOT ending in "WORD") and replaces it with itself plus .com.
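For comparison, the same idea (match only lines that do not already end in .com, then append it) can be sketched outside Vim. This is a minimal Python sketch, not the Vim command itself; the names MISSING_COM and add_missing_com are made up for illustration, and it assumes one address per line and a case-sensitive .com.

```python
import re

# A non-empty line whose last four characters are not ".com".
# The negative lookbehind plays the role of Vim's \@<! here.
MISSING_COM = re.compile(r"^(.+)(?<!\.com)$", re.MULTILINE)


def add_missing_com(text: str) -> str:
    """Append ".com" to every non-empty line that does not already end with it."""
    return MISSING_COM.sub(r"\1.com", text)


emails = """foo#gmail
bar#hotmail.com
bas#gmail
qux#abc.com
mike#abc
john#email"""

print(add_missing_com(emails))
```

Lines that already end in .com never match, so they are left untouched rather than being rewritten to themselves.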

Addresses such as xxx#gmail.com need no further replacement, so only replace the ones without .com, like this:
/^(.+)(?<!\.com)$/$1.com/mi (I assumed each address is on its own line.)
Please don't downvote, I tried to explain.

Related

Replacing multiple slashes in a URL through regex

I am trying to create a regex in PCRE that will sanitize URLs with multiple slashes, like the following:
https://www.domin.com/test1/////test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142/////
https://www.domin.com/test1/test2///somemoretests_67142
I want to replace each of them with something like https://\2\4, so that the resulting link looks like: https://www.domin.com/test1/test2/somemoretests_67142
I have been struggling with it for the past couple of days, so any regex guru help is more than welcome :)
I have tried the following and more:
(http|https):\/\/(.*)(\/\/+)(.*)
(http|https):\/\/(.*)(\/\/){2,}(.*)
(http|https):\/\/(.*)(\/\/{2})(.*)
I am going to use these in Akamai to sanitize our URLs through a cloudlet.
You can try:
(?<!https:\/)(?<!http:\/)(\/+$|(?<=\/)\/+)
And substitute the first group with an empty string.
Regex demo.
This will produce this output:
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
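As a quick check of the pattern above, here is a small Python sketch that applies it to the three URLs from the question. Python's re module is used instead of PCRE (the pattern only needs fixed-width lookbehinds, which both support), and the names EXTRA_SLASHES and collapse_slashes are hypothetical.

```python
import re

# Same pattern as above, without the escaped slashes that Python does not need.
# The two negative lookbehinds protect the second slash of "http://" and "https://".
EXTRA_SLASHES = re.compile(r"(?<!https:/)(?<!http:/)(/+$|(?<=/)/+)")


def collapse_slashes(url: str) -> str:
    """Drop trailing slashes and any slash that directly follows another slash."""
    return EXTRA_SLASHES.sub("", url)


urls = [
    "https://www.domin.com/test1/////test2/somemoretests_67142",
    "https://www.domin.com/test1/test2/somemoretests_67142/////",
    "https://www.domin.com/test1/test2///somemoretests_67142",
]

for url in urls:
    print(collapse_slashes(url))
```

Note that a single trailing slash is removed as well, since the first alternative matches any run of slashes at the end of the string.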

Find strings that match a specific regex, then exclude from that list anything that matches one or more conditions

I'm using AstroGrep to search for email addresses.
To make it very tolerant, I'm using this regex: [a-z0-9_.-]+#[a-z0-9_.-]+
But I have to ignore some addresses in that search.
If I find ABCD or YXZ at the beginning of the address, like ABCDsomething#something.com and YXZsomething#something.com, I have to exclude it from the results.
I have tried a few things, like:
(?!abcd)|([a-z0-9_.-]+)#[a-z0-9_.-]+
^(?!abcd+)|([a-z0-9_.-]+)#[a-z0-9_.-]+
(?!abcd)([a-z0-9_.-]+)#[a-z0-9_.-]+
(?!abcd+)([a-z0-9_.-]+)#[a-z0-9_.-]+
etc...
This seemed easy when I searched on Google, but I cannot find a way to make it work.
Edit:
Create 3 text files in a folder.
The first file contains 3 lines:
abcdsomething#something.com
xyzsomething#something.com
something#something.com
The second file contains 1 line:
something#something.com
The third file contains 3 lines:
email1="abcdsomething#something.com"
email2="xyzsomething#something.com"
email3="something#something.com"
Search that folder with AstroGrep, case-insensitive.
Expected result: 1 email found in each file.
With [a-z0-9_.-]+#[a-z0-9_.-]+ I get all the emails properly, but I just want to ignore the ones that start with abcd / xyz.
It seems that the answer was this: \b(?!abcd|xyz)[a-z0-9_.-]+#[a-z0-9_.-]+
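You can use a negative lookahead placed right after a word boundary, like this:
\b(?!abcd|xyz)[a-z0-9_.-]+#[a-z0-9_.-]+\b
This fails the match if abcd or xyz is found at the start of the address. Keep the \b outside the lookahead; if it sits inside, the engine can skip the first character and still match the rest of the address (bcdsomething#something.com).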
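To check the behaviour against the three test files described in the question, here is a minimal Python sketch using the pattern the asker settled on, \b(?!abcd|xyz)[a-z0-9_.-]+#[a-z0-9_.-]+. It uses Python's re module rather than AstroGrep, so it only approximates what AstroGrep does; the names EMAIL and find_addresses are made up for illustration.

```python
import re

# The word boundary pins the lookahead to the start of the address, so the
# engine cannot skip the "a" of "abcd" and match from "bcd..." instead.
EMAIL = re.compile(r"\b(?!abcd|xyz)[a-z0-9_.-]+#[a-z0-9_.-]+", re.IGNORECASE)


def find_addresses(text: str) -> list:
    """Return every address in text that does not start with abcd or xyz."""
    return EMAIL.findall(text)


first_file = "abcdsomething#something.com\nxyzsomething#something.com\nsomething#something.com"
third_file = (
    'email1="abcdsomething#something.com"\n'
    'email2="xyzsomething#something.com"\n'
    'email3="something#something.com"'
)

print(find_addresses(first_file))
print(find_addresses(third_file))
```

Each call should report only the one address that does not start with an excluded prefix, which is the "1 email found in each file" result the question asks for.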

Regex remove last dot from string in Yahoo Pipes

I have a couple of strings that end with a dot (.), which I need to remove in Yahoo Pipes.
Example:
example.com.
companywebsite.co.uk.
anothersite.co.
I've tried the following from a couple of posts here on SO but none have worked yet
/\.$/
or
^(.*)\\.(.*)$","$1!$2
Neither of these options has worked.
I have also tried a very simple find of .com. replaced with .com, and .co. replaced with .co, but the latter affects .com as well, which is not ideal.
EDIT: Here is a visual of what my pipe looks like.
If you can do something like this: ^(.*)\\.(.*)$","$1!$2, then doing this should work: "^(.+?)\.?$", $1. This should match the first part of the URL and leave out the period at the end, should it exist.
EDIT:
As per your image, you should place this: ^(.+?)\.?$ in your replace field and this: $1 in your with field. I do not know if you need to do any escaping, so you might have to use ^(.+?)\\.?$ instead of ^(.+?)\.?$.
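The pattern can be tried outside Yahoo Pipes first. This is a small Python sketch of the same replace (^(.+?)\.?$ with the first group as replacement); Python writes the group reference as \1 instead of $1, and the names TRAILING_DOT and strip_trailing_dot are hypothetical.

```python
import re

# Lazy group, then an optional final dot: the group grows only until the
# remainder of the string is either "." or empty.
TRAILING_DOT = re.compile(r"^(.+?)\.?$")


def strip_trailing_dot(value: str) -> str:
    """Remove a single trailing dot, if there is one."""
    return TRAILING_DOT.sub(r"\1", value)


for value in ["example.com.", "companywebsite.co.uk.", "anothersite.co.", "example.com"]:
    print(strip_trailing_dot(value))
```

Because the group is lazy and the dot is optional, strings without a trailing dot pass through unchanged, and dots in the middle of the string are not affected.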

simplify a regex to reduce recursion

I currently have a regex like this:
/^From: ((?!\n\n).)*\nSubject:.+/msu
with the point of matching a block that looks like this:
From: John Smith
Cc: Jane Smith
Subject: cat videos
(i.e. where they're in a contiguous block) but not if there is a blank line breaking up the block, like this:
From: John Smith

Subject: cat videos
but I've been finding that my PHP script that uses this is sometimes segfaulting. I was able to mitigate the segfaults by setting pcre.recursion_limit to a lower number (I used 8000), but it occurs to me that what I'm trying to do should be doable without a great deal of recursion. Am I using a horribly inefficient method of catching the \n\n ?
This is just a terrible use for a single regex. In addition to the performance problems you're having, it's going to fail on straightforward cases such as messages where the "Subject:" line appears before "From:". If you want to parse an RFC 822 email header, then you really should be parsing it.
Find the empty line terminator of the header. Join lines beginning with whitespace to the previous line (i.e. replace newline-followed-by-whitespace with a space). Split each line at the first colon and snip leading and trailing whitespace from each side.
Or find an appropriate library to do that for you.
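The manual steps described above (find the blank line, unfold continuation lines, split at the first colon, trim) can be sketched as follows. This is a minimal Python illustration rather than the PHP the question uses; the function name parse_headers is made up, and it assumes LF line endings and keeps only the last value when a header name repeats.

```python
import re


def parse_headers(message: str) -> dict:
    """Parse the header block of a message into a {lowercased name: value} dict."""
    # 1. The header block ends at the first empty line.
    header_block = message.split("\n\n", 1)[0]
    # 2. Unfold: a newline followed by whitespace continues the previous line.
    unfolded = re.sub(r"\n[ \t]+", " ", header_block)
    headers = {}
    for line in unfolded.split("\n"):
        # 3. Split at the first colon and trim both sides.
        name, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            headers[name.strip().lower()] = value.strip()
    return headers


message = "From: John Smith\nCc: Jane Smith\nSubject: cat\n videos\n\nBody: not a header\n"
print(parse_headers(message))
```

With the headers in a dict, the order of From: and Subject: no longer matters, and nothing here backtracks over the whole header block.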
You should not rely on a regex to parse mail messages. It is better to use a PHP MIME mail parser for this task. With Mime Mail Parser, the code will be as simple as:
require_once('MimeMailParser.class.php');
$path = 'path/to/mail.txt';
$Parser = new MimeMailParser();
$Parser->setPath($path);
$to = $Parser->getHeader('to');
$from = $Parser->getHeader('from');
$subject = $Parser->getHeader('subject');
$textBody = $Parser->getMessageBody('text');
$htmlBody = $Parser->getMessageBody('html');
I would simply use "not a newline":
/^From:[^\n]*\nSubject:.+/msu
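One caveat: a pattern that requires Subject: directly after the From: line will not match the three-line example from the question, where Cc: sits in between. As a sketch of one way to allow intermediate non-empty header lines while still rejecting a blank line, here is a Python variant. It uses Python's re module rather than PHP's PCRE, so its recursion behaviour is not the same as in the question, and the name HEADER_BLOCK is only illustrative.

```python
import re

# From: line, then zero or more non-empty lines, then the Subject: line.
# [^\n]+ needs at least one character, so an empty line cannot be skipped.
HEADER_BLOCK = re.compile(
    r"^From:[^\n]*\n(?:[^\n]+\n)*?Subject:[^\n]*",
    re.MULTILINE,
)

contiguous = "From: John Smith\nCc: Jane Smith\nSubject: cat videos\n"
broken = "From: John Smith\n\nSubject: cat videos\n"

print(HEADER_BLOCK.search(contiguous) is not None)
print(HEADER_BLOCK.search(broken) is not None)
```

The repetition here steps one line at a time instead of one character at a time, which is the main difference from the ((?!\n\n).)* construct in the question.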

match regex to remove password from file

I have a file which contains the following lines:
app.mail.host = 10.1.1.1
app.mail.debug = true
app.db.username = spate
app.db.password = 1#4FnL&#7!
I want to use a regex to replace the actual password with XXXXXXX for security purposes.
I am trying the following regex, but it isn't working:
sed 's/app\.db\.password="[^"]\+/app\.db\.password="XXXXXXXX"/g' foo.txt
As Hbcdev noted, you aren't matching due to whitespace. As this appears to be "security" code (in which case -- why are you storing that password in plaintext at all?), it's probably better to be whitespace-tolerant than match the input byte-for-byte. Something like:
sed 's/app\.db\.password[ \t]*=.*/app.db.password="xxxxx"/' foo.txt
(untested) is probably going to work a little more robustly. Note that it will strip your password field even if it doesn't begin with a quote.
Still, doing this kind of hackery with a shell script sounds dangerous. What are you trying to accomplish?
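The whitespace-tolerant idea can also be sketched in Python, which makes it easy to test against the sample file before touching real data. This is only an illustration: the names PASSWORD_LINE and mask_password are made up, and unlike the sed command above it keeps the original spacing around = and writes the mask without quotes.

```python
import re

# Capture the key and the "=" with whatever spacing surrounds it,
# then replace everything after it up to the end of the line.
PASSWORD_LINE = re.compile(r"^(app\.db\.password[ \t]*=[ \t]*).*$", re.MULTILINE)


def mask_password(text: str, mask: str = "XXXXXXX") -> str:
    """Replace the value of app.db.password with a fixed mask."""
    # A function replacement avoids any backslash handling in the mask string.
    return PASSWORD_LINE.sub(lambda m: m.group(1) + mask, text)


config = (
    "app.mail.host = 10.1.1.1\n"
    "app.mail.debug = true\n"
    "app.db.username = spate\n"
    "app.db.password = 1#4FnL&#7!\n"
)

print(mask_password(config))
```

Only the app.db.password line is rewritten; the other keys pass through unchanged.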
You haven't left spaces around the = sign in your sed expression. Once you do that I think it'll work.