Replace multiple IPs in multiple files with sed on HP-UX

Can anyone tell me how I can mass-replace IPs in multiple files with one command? What does this sed command do?
sed 's/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/x.x.x.x/g' *
Really need help here. Thanks!

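This sed command does the following:
s/pattern1/pattern2/g
Replaces every match of pattern1 with pattern2 (the trailing g means all matches on a line, not just the first)
[0-9]\{1,3\} = 1 to 3 digits from 0-9
\. means a single literal dot .
So in theory this changes every IP in all files to the given string x.x.x.x
* means all files in the current folder
As written, sed only prints the result to standard output and the files themselves are not modified. If you write the output back over the files, no original IPs are left, so be careful with this.
PS: this is not 100% reliable. For example, the number 3452.343.13.34 (not an IP) will be changed to 3x.x.x.x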
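A minimal sketch of the command above applied to sample input, plus one way to write the result back file by file. The function names mask_ips and mask_ips_in_files are hypothetical, and the temporary-file loop is an assumption for a sed without in-place editing (the question mentions HP-UX), not something the answer prescribes.

```shell
# Mask anything shaped like a dotted quad on standard input.
mask_ips() {
    sed 's/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/x.x.x.x/g'
}

# Rewrite each named file through a temporary copy (no sed -i needed).
mask_ips_in_files() {
    for f in "$@"; do
        [ -f "$f" ] || continue          # skip directories and missing names
        tmp="$f.tmp.$$"
        if mask_ips < "$f" > "$tmp"; then
            cat "$tmp" > "$f"            # overwrite in place, keeping permissions
        fi
        rm -f "$tmp"
    done
}

printf 'gw 192.168.1.254 dns 10.0.0.1\n' | mask_ips
```

Calling mask_ips_in_files * in a directory would then do what the one-liner only prints. Keep a backup first, since the original addresses are gone afterwards.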

sed "s/\([12]\{0,1\}[0-9]\{0,1\}[0-9]\.\)\{3\}[12]\{0,1\}[0-9]\{0,1\}[0-9]/x.x.x.x/g"
but
If a digit comes directly before or after, it is ignored and the inner part is still treated as an IP
Numbers greater than 255 and lower than 300 are still treated as part of an IP
IPs containing a three-digit octet with a leading 0 are not matched (like 120.008.099.234)
If those things matter, a more complex sed has to be built (a cascaded one, I think), like
sed "s/.*/#&#/;s/\([^0-9.]\)\([012]\{0,1\}[0-9]\{0,1\}[0-9]\.\)\{3\}[12]\{0,1\}[0-9]\{0,1\}[0-9]\([^0-9.]\)/\1x.x.x.x\3/g;s/^#\(.*\)#$/\1/"
(numbers between 255 and 300 are still possible)
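The remaining gap (octets between 255 and 300) is hard to close in a basic sed regex. The sketch below swaps sed for awk, which is a different technique from the answer above: it finds each dotted group of digit runs and checks every octet numerically. The function name mask_valid_ips is hypothetical. One known limitation: a candidate that fails the check is skipped as a whole, so an address embedded in a longer digit run is not recovered.

```shell
# Replace only dotted quads whose four octets are each 0-255.
mask_valid_ips() {
    awk '
    {
        line = $0; out = ""
        # Walk the line left to right, one dotted quad at a time.
        while (match(line, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)) {
            cand = substr(line, RSTART, RLENGTH)
            n = split(cand, oct, ".")
            ok = (n == 4)
            for (i = 1; i <= 4 && ok; i++)
                if (length(oct[i]) > 3 || oct[i] + 0 > 255) ok = 0
            repl = ok ? "x.x.x.x" : cand
            out = out substr(line, 1, RSTART - 1) repl
            line = substr(line, RSTART + RLENGTH)
        }
        print out line
    }'
}

printf 'a 10.0.0.1 b 299.1.1.1 c 3452.343.13.34\n' | mask_valid_ips
```

Octets with leading zeros such as 008 are accepted here because they are compared as numbers.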

Related

regex - multiply $1 by 10

I want to replace the results of this:
(something=)([\-\d\.]*)
with this:
nowitis=($2*10)
but instead of getting
nowitis=(80)
I get
nowitis=(8*10)
How to solve it?
In sed, for example:
echo "something=123" | sed -r 's/(something=)([-0-9.]*)/\1\2*10)/'
something=123*10)
echo "something=123" | sed -r 's/(something=)([-0-9.]*)/\1\20/'
something=1230
For integers, multiplying by 10 is just appending a zero to the number. Sed doesn't calculate results.
The digits are written as 0-9 because sed does not treat \d as a digit class. With \d inside the brackets, the second group matches nothing here and the zero lands in front of the number:
echo "something=123" | sed -r 's/(something=)([-\d.]*)/\1\20/'
something=0123
In the bracket expression [-0-9.], the - sign is leading, so it can't be part of a range like A-Z. Well, it could, it could mean from \0 to something, but it doesn't. As the first or last character, it doesn't need escaping.
Similarly, if a dot inside a bracket expression were interpreted as a wildcard, every bracket expression containing a dot could be reduced to just that wildcard. A wildcard therefore makes no sense there, so you don't have to escape the dot either.
Let's suppose you are on a POSIX system with Perl available.
echo "something= 8" | perl -pe 's/\w\s*=\s*\K-?\d+(\.\d+)?/$&*10/ge'
something= 80
What you want to do is not possible with plain regular expressions because they cannot do arithmetic, e.g. compute 8*10. One way is to use an interpreter that can.
Perl has a nice feature for this, the e modifier. It evaluates the replacement as Perl code, in which I compute $& * 10, where $& is the matched text.
The input string can be like:
something=10.2
something=-3.15
So there can be negative numbers and float numbers.
I use the PhpStorm IDE and its find & replace function with regex.
So that works fine, but there is no multiplication.
So I think I could do it in a couple of runs.
For example, in the next run I would find my results and then move the dot by one place.
I read the PCRE docs and didn't find a multiplication option.
It would be easier to write a script, even in PHP, to do it right.
But I thought it could be done more easily.

Using \d{1,3} when creating regex to find IPs

Why would one use {1,3} in \d{1,3} when catching an IP with grep? For example:
grep -Po 'inet addr:\K(?!127\.)\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
\K drops inet addr: from the reported match, and (?!127\.), AFAIU, rejects any address that starts with 127 (the loopback in that case), but what are the {1,3} after \d?
Clearly, we don't only want IP classes that start with 1 and end with 2 or 3, so the purpose there is unclear to me.
Note: inet addr: is part of the output of the ifconfig Linux utility.
While writing the question I figured out the purpose: it means that each of the 4 classes has between 1 and 3 digits.
Indeed, in IPv4 (I don't know about IPv6) each class has at most 3 digits.
You have answered your question yourself. Note, however, that for general IPv4 addresses a stricter regex such as the following should be used:
'\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b'
which you could adapt to exclude the localhost address.
In your case, the grep will also return chains of digits that are not proper IPs (e.g. integers > 255).
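To make the difference to \d{1,3} concrete, here is a minimal sketch that validates a single string with the same octet alternation, anchored at both ends. The function name is_ipv4 is hypothetical, and the anchored form is a simplification for checking one value, not a drop-in replacement for the regex above.

```shell
# Succeeds only if "$1" is four dot-separated octets, each in 0-255.
is_ipv4() {
    octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
    printf '%s\n' "$1" | grep -Eq "^$octet(\.$octet){3}\$"
}

for candidate in 192.168.1.1 256.1.1.1 999.999.999.999; do
    if is_ipv4 "$candidate"; then
        echo "$candidate looks like an IPv4 address"
    else
        echo "$candidate is rejected"
    fi
done
```

A pattern built only from \d{1,3} would accept all three candidates, since it checks the digit count and not the value.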

Use linux grep to find emails

If this question has been asked and answered before, my apologies. I couldn't find anything from looking.
How can I use linux grep / regex to find unknown characters in an email address? For example, let's say we had this list:
userone:123456#example.com
usertwo:123#example.com
userthree:12#example.com
How could I grep the list to find emails matching ***#example.com?
(the only email that should be found from this is 123#example.com)
I'm aware that grep -e '...#example\.com' would work, but periods can represent any character in grep, so doing this would also find :12#example.com. Plus, most email addresses don't contain just any character; they are typically confined to letters, numbers, periods, and underscores (many email providers don't allow anything else).
I need to use something else besides a period symbol in grep, something like [a-Z0-9._] so that letters, numbers, periods, and underscores are included but nothing else. I'm unsure of how to go about this. Thanks
EDIT: What I've tried so far:
grep -e '[a-zA-Z0-9_.]{3}#example.com' *. This doesn't work, so it comes down to just me getting the regex wrong.
If the email addresses are always preceded by a username, which is then followed by a colon or a space and then the email address, you can use that knowledge to restrict your matches.
What does a username look like? You need to know if you're going to use it to find matches. Let's say for now it is letters, numbers, dash, and underscore, it always starts with a letter, and is from 2 to 12 characters long. We also know it's got a colon or space after it. The regex for that (in extended syntax, as used by grep -E) is
[A-Za-z][A-Za-z0-9_-]{1,11}[: ]
That would be followed by your email address which, it sounds like, is something you decide on and input because that's what you're looking for at the moment.
Your example of test*****#example.com would be
[A-Za-z][A-Za-z0-9_-]{1,11}[: ]test.+#example.com
or, if exactly 5 chars after "test"
[A-Za-z][A-Za-z0-9_-]{1,11}[: ]test.....#example.com
Your original sample ***#example.com is "any 3-char address at example.com" and would be
[A-Za-z][A-Za-z0-9_-]{1,11}[: ]...#example.com
This would be a pain to retype that prefix all the time, so you'd want to wrap that in a script that uses prefix + what_i_typed as the pattern.
Try this command line, which I use to find anything in any files (the pattern is quoted so the shell does not treat # as the start of a comment):
grep -r -i '#example.com' ./
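The attempt in the question most likely failed because plain grep uses basic regular expressions, where {3} is taken literally unless written as \{3\}. A minimal sketch with grep -E follows. The function name find_three_char_addresses is hypothetical, and the leading group that demands a non-address character (or line start) before the three characters is an assumption added so that longer addresses such as 123456#example.com are not matched by their last three characters.

```shell
# Print lines holding an address with exactly three allowed characters
# before the "#" separator used in the sample list.
find_three_char_addresses() {
    grep -E '(^|[^A-Za-z0-9_.])[A-Za-z0-9_.]{3}#example\.com'
}

printf '%s\n' \
    'userone:123456#example.com' \
    'usertwo:123#example.com' \
    'userthree:12#example.com' | find_three_char_addresses
```

Only the usertwo line is printed, because the colon in front of 123 is outside the allowed character set while the 3 in front of 456 is not.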

How to replace all email addresses in a set of files with a generic email address

I have some scripts which have many email addresses in different domains (say domain1.com, domain2.com). I want to replace all of them with some generic email address in a common domain, say domain.com, keeping the rest of the script the same.
I am using the command below in sed, but it doesn't seem to work (it returns the same output as the input, so it looks like the search is not matching). However, when I tested the regex \S+#\S+ in an online tester, it seemed to match email addresses.
s/\S+#\S+/genericid#domain.com/g
For example, I have 2 scripts
$ cat script1.sh
abcd.efg#domain.com
export SENDFROM="xyz#domain1.com" blah_4
$ cat script2.sh
echo foo|mailx -s "blah" pqr#domain2.com,def#domain.com,some#domain.com
omg#domain.com
foo abc#domain.com bar
My result after sed -i should be
$ cat script1.sh
genericid#domain.com
export SENDFROM="genericid#domain.com" blah_4
$ cat script2.sh
echo foo|mailx -s "blah" genericid#domain.com,genericid#domain.com,genericid#domain.com
genericid#domain.com
foo genericid#domain.com bar
I am using Linux 3.10.0-327.28.2.el7.x86_64
Any suggestion please?
Update:
I managed to make it work with 's/\S\+#\S\+.com/genericid#domain.com/g'. There were 2 problems with the previous search:
The + needed a \ before it.
As the file had other # lines (for database connections), I had to append .com at the end, as all my addresses ended in .com.
Capturing email addresses using regex can be more difficult than it seems.
Anyhow, for replacing the domain, I think you could simplistically assume that an email domain starts where the regex finds:
1 alphanum char + # + N alphanum chars + . + N alphanum chars
Based on this assumption, in JavaScript I would use:
(\w#)(\w*\.\w*)
Replacing with:
$1newdomain.com
Hope it helps you.
UPDATE - Other answers, and comments on this one, point out that you may have to take extra steps to enable shorthand character class matching; I'm used to doing regex in perl, where this just works, so didn't think to address that possibility. This answer only addresses how to improve the matching once you have the regex functioning.
--
While the problem of matching email addresses with regex can be very complex (and in fact in the most general case isn't possible with true regex), you probably can handle your specific case.
The problem with what you have is that \S matches any non-whitespace, so address#something.com,address#somethingelse.com, where two addresses have no whitespace between, matches incorrectly.
So there's a couple ways to go about it, based on knowing what sorts of addresses you realistically will see. One solution would be to replace both instances of \S with [^\s,] (note the lowercase s), which simply excludes , from the match as well as whitespace.
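The idea above can be sketched in sed as follows. Inside a sed bracket expression \s is not a portable whitespace class, so the sketch uses [:space:] instead. Excluding double quotes and requiring a trailing .com are assumptions taken from the sample scripts and the asker's update, the function name replace_emails is hypothetical, and -E is assumed to be available (it is in GNU and BSD sed).

```shell
# Replace each address-like token with the generic address.
# A token is: non-space/comma/quote/# characters, "#", then more
# non-space/comma/quote characters ending in ".com".
replace_emails() {
    sed -E 's/[^[:space:],"#]+#[^[:space:],"]+\.com/genericid#domain.com/g'
}

printf '%s\n' \
    'export SENDFROM="xyz#domain1.com" blah_4' \
    'echo foo|mailx -s "blah" pqr#domain2.com,def#domain.com' | replace_emails
```

Lines that contain the separator but no .com after it, such as a database connection string, are left unchanged by this pattern.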
Try this
sed 's/[^,#]*#[^,]*/genericid#domain.com/g'
and
echo 'pqr#domain2.com,def#domain.com,some#domain.com' | sed 's/[^,#]*#[^,]*/genericid#domain.com/g'
result
genericid#domain.com,genericid#domain.com,genericid#domain.com
Still UNIX-related, though requiring the more modern and far from ubiquitous tool, Ammonite, you could use email-replace.
$ amm path/to/email-replace.sc <random integer seed> <file1 with emails> <file2 with emails> ...
DISCLAIMER: the matcher is likely far from perfect, so use at your own risk, and always have backups available.
Note that by default it replaces emails with a new random e-mail address. To use a fixed email address, just replace the call to randEmail with a constant string.

How can I match zero or more instances of a pattern in bash?

I'm trying to loop through a bunch of file prefixes looking for a single line matching a given pattern from each file. I have extracted and generalized a couple examples and have used them below to illustrate my question.
I searched for a line that may have some spaces at the beginning, followed by the number 1234, with maybe some more spaces, and then the number 98765. I know the file of interest begins with l76.logsheet and I want to extract the line from the file that ends with one or more numbers. However, I want to make sure I exclude files ending with anything else (of which there are too many options to reasonably use the grep --exclude option). Here's how I did it from the tcsh shell:
tcsh% grep -E '^\s{0,}1234\s+98765' l76.logsheet[0-9]{0,}
l76.logsheet10:1234 98765 y 13:02:44 2
And here's another example where I was again searching for 98765, but with a different number out front and a different file prefix:
tcsh% grep -E '^\s{0,}4321\s+98765' k43.logsheet[0-9]{0,}
k43.logsheet1: 4321 98765 y 13:06:38 14
Works great and returns just what I need.
My problem is with the bash shell. Repeating the same command returns a rather interesting result. With the first line, there are no problems:
bash$ grep -E '^\s{0,}1234\s+98765' l76.logsheet[0-9]{0,}
which returns:
l76.logsheet10:1234 98765 y 13:02:44 2
But in the second example the filename has only one digit at the end. In bash this produces an error before the correct result:
bash$ grep -E '^\s{0,}4321\s+98765' k43.logsheet[0-9]{0,}
grep: k43.logsheet[0-9]0: No such file or directory
k43.logsheet1: 4321 98765 y 13:06:38 14
My question is, how do I search for files ending in zero or more of the previous pattern from the bash shell? I have a workaround, but I'm looking for an actual answer to this question, which may save me (and hopefully others) time in the future.
First, make sure that extglob is set:
shopt -s extglob
Now, we can match zero or more of any pattern with *(...). For example, let's create some files and match them:
$ touch logsheet logsheet2 logsheet23 logsheet234
$ echo logsheet*([0-9])
logsheet logsheet2 logsheet23 logsheet234
Documentation
According to man bash, bash offers the following features with extglob:
?(pattern-list)
Matches zero or one occurrence of the given patterns
*(pattern-list)
Matches zero or more occurrences of the given patterns
+(pattern-list)
Matches one or more occurrences of the given patterns
@(pattern-list)
Matches one of the given patterns
!(pattern-list)
Matches anything except one of the given patterns
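Judging by the error message in the question, bash treats {0,} as brace expansion, turning the argument into the two words k43.logsheet[0-9]0 and k43.logsheet[0-9], which explains the missing-file error. The sketch below applies *([0-9]) instead. It starts a child bash with -O extglob so that the pattern is parsed with extglob already enabled. The function name list_logsheets and its two parameters are hypothetical.

```shell
# List files in directory "$1" named "$2" followed by zero or more digits.
list_logsheets() {
    (
        cd "$1" || exit 1
        # extglob must be on before the pattern is parsed, hence -O extglob.
        bash -O extglob -c '
            for f in "$1"*([0-9]); do
                if [ -e "$f" ]; then printf "%s\n" "$f"; fi
            done' _ "$2"
    )
}

# Example (assumes the log sheets live in the current directory):
# grep -E '^\s{0,}4321\s+98765' $(list_logsheets . k43.logsheet)
```

In an interactive bash session, running shopt -s extglob once and then using k43.logsheet*([0-9]) directly on the grep command line has the same effect.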