Regex replace phone numbers with XXXXXXXXXX - regex

I have a text file containing a few phone numbers and other important data. I would like to replace all the phone numbers with a predefined text, let's say XXXXXXXXXX.
How to do it using sed/awk? The regex
^\s*(?:\+?(\d{1,3}))?[-. (]*(\d{3})[-. )]*(\d{3})[-. ]*(\d{4})(?: *x(\d+))?\s*$
did not work for me.
Input:
Add me 7598128789
Pls add mi 9761500634
Add 8870504046
spam post
magar maddam is not required
all hero hain
All follows
Output:
Add me XXXXXXXXXX
Pls add mi XXXXXXXXXX
Add XXXXXXXXXX
spam post
magar maddam is not required
all hero hain
All follows

You can do it like this in Perl (-p already loops over the input and prints each line, so -n and cat are not needed):
perl -pe 's/\d{10}/XXXXXXXXXX/g' a

Try:
gawk '{gsub(/[0-9]{10}/,"XXXXXXXXXX");print}' Input_file
This replaces every run of 10 consecutive digits with a string of 10 X characters and then prints the line.

with GNU sed:
sed -r 's/\b[0-9]{10}\b/XXXXXXXXXX/g' filename
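For comparison, here is a minimal Python sketch of the same idea as the sed answer. The pattern in the question is anchored with ^ and $, so it can only match a line that is nothing but a phone number, and it relies on (?: and \d, which sed and awk regexes do not support. The helper name mask_phones is made up for illustration, and the sketch assumes a phone number is a standalone run of exactly ten digits.

```python
import re

# Assumption: a phone number is exactly ten digits with word boundaries
# on both sides, as in the GNU sed answer above.
PHONE = re.compile(r"\b\d{10}\b")


def mask_phones(text: str, mask: str = "XXXXXXXXXX") -> str:
    """Replace every standalone ten-digit run in text with the mask."""
    return PHONE.sub(mask, text)


if __name__ == "__main__":
    sample = "Add me 7598128789\nspam post\nAdd 8870504046"
    print(mask_phones(sample))
```

The word boundaries keep longer digit runs (for example a 12-digit ID) untouched, which the unanchored \d{10} versions would partially mask.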

Related

Regex grep command to select only passwords that start with a number and end with a number

I'm trying to formulate a grep regex expression that selects only passwords that start with a number and ends with a number. The format of the txt file is:
password, #OfUsersWhoUseThisPassword
(The comma is included) for example:
123456, 25969
12345678, 8667
1234, 5786
qwerty, 5455
dragon, 4321
Regex:
^[0-9]{1}.*[0-9]{1}(?=,)
Try this :
grep -oP '^\d.*?\d(?=,)' /tmp/file
-o is to print only what is matching
-P is to use perl like regex
(?=,) is a lookahead: it requires a comma right after the match without including the comma in the output.
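If grep -P is not available, the same pattern can be tried in Python. This is only a sketch: the function name digit_bounded_passwords is hypothetical, and it assumes each line looks like "password, count" as in the question.

```python
import re

# Starts with a digit, extends lazily, and ends on a digit that is
# immediately followed by a comma (the comma itself is not captured).
PATTERN = re.compile(r"^\d.*?\d(?=,)")


def digit_bounded_passwords(lines):
    """Return the passwords that start and end with a digit."""
    found = []
    for line in lines:
        match = PATTERN.match(line)
        if match:
            found.append(match.group(0))
    return found


if __name__ == "__main__":
    rows = ["123456, 25969", "1234, 5786", "qwerty, 5455", "dragon, 4321"]
    print(digit_bounded_passwords(rows))
```

Like the grep version, this needs at least two characters, so a one-digit password would not be reported.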

regex, repeat, count group

I need some help with a regex that matches this format:
The first part of the string is an email address, followed by eight columns separated by ";".
a.test#test.com;Alex;Test;Alex A.Test;Alex;12;34;56;78
The first part I have is (.*#.*com)
these are also possible source strings:
a.test#test.com;Alex;;Alex A.Test;;12;34;56;78
a.test#test.com;Alex;;Alex A.Test;Alex;;34;;78
a.test#test.com;Alex;Test;;Alex;12;34;56; and so on
You can try this regex:
^(.*#.*com)(([^";\n]*|"[^"\n]*");){8}(([^";\n]*|"[^"\n]*"))$
If you have a different number of columns after the address, change the number between { and }.
For your data, here are the captures (a repeated group only keeps its last repetition):
1. `a.test#test.com`
2. `56;`
3. `56`
4. `78`
If you are sure there will be no " in your strings you can use this:
^(.*#.*com)(([^;\n]*);){8}([^;\n]*)$
Edit:
The OP suggested this usage:
To use the first regex with sed, you need the -i, -n and -E flags, and you must escape the " characters.
The result looks like this:
sed -i -n -E "/(.*#.*com)(([^\";\n]*|\"[^\"\n]*\");){8}(([^\";\n]*|\"[^\"\n]*\"))/p"
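You can have something like this (commas inside [...] are literal characters, not separators, so they are left out of the classes here):
".*#.*\.com;[A-Za-z]*;[A-Za-z]*;[A-Za-z .]*;[A-Za-z]*;[0-9][0-9];[0-9][0-9];[0-9][0-9];[0-9][0-9]"
This assumes every number has exactly two digits.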
Using awk you can do this easily:
awk -F ';' '$1 ~ /\.com$/{print NF}' file
9
9
9
cat file
a.test#test.com;Alex;;Alex A.Test;;12;34;56;78
a.test#test.com;Alex;;Alex A.Test;Alex;;34;;78
a.test#test.com;Alex;Test;;Alex;12;34;56; and so on
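To get at every column rather than only the last repetition of the group, one option is to validate the line with the regex and then split it. The sketch below does this in Python with the quote-free pattern from the answer above. The name parse_row is hypothetical, and the sample strings are copied from the question as they appear there.

```python
import re

# The quote-free pattern from the answer above: address, eight
# ';'-terminated chunks, then one final column.
ROW = re.compile(r"^(.*#.*com)(([^;\n]*);){8}([^;\n]*)$")


def parse_row(line):
    """Return (address, columns) for a valid line, otherwise None."""
    if ROW.match(line) is None:
        return None
    # Splitting gives all eight columns, including empty ones.
    address, *columns = line.split(";")
    return address, columns


if __name__ == "__main__":
    print(parse_row("a.test#test.com;Alex;;Alex A.Test;;12;34;56;78"))
```

Empty columns come back as empty strings, so the position of each value is preserved.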

Regex to extract content from each line of a log file output from '_m' to the end of the line

Format of log line:
Xxx x xx:xx:xx xmmxxx XXXXXX: XXXXXXX:XXX: xxx_Mxxx_Xxxxxx_mxxxxxmmxx [XXX xxxx.
I want to extract from '_m' to the end of the line, removing the '_' before the 'm'.
New to regex...
Thanks!
If your tool or language supports look-behind, this works: it matches from the first _m to the end of the line and leaves out the leading _.
(?<=_)m.*
test with grep:
kent$ echo "Xxx x xx:xx:xx xmmxxx XXXXXX: XXXXXXX:XXX: xxx_Mxxx_Xxxxxx_mxxxxxmmxx [XXX xxxx."|grep -Po '(?<=_)m.*'
mxxxxxmmxx [XXX xxxx.
With sed (because the leading .* is greedy, this keeps the text from the last _m on the line):
sed -n 's/^.*_\(m.*$\)/\1/p' file
It is quite easy.
This example is written in C#, but the regex is quite general and will probably work anywhere:
Regex regex = new Regex(@"_(m.*)"); // If you are looking for _M, the regex should be @"_(M.*)"
Match match = regex.Match(logLine);
if (match.Success)
    Console.WriteLine(match.Groups[1].Value);
Hope this will help you on your quest.
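The capture-group approach also avoids the need for look-behind. Here is a small Python sketch of it, using the masked sample line from the question. The function name tail_from_m is made up for illustration.

```python
import re


def tail_from_m(line):
    """Return the text from the first '_m' onward, without the underscore."""
    # re.search finds the leftmost '_m'; group 1 starts at the 'm'.
    match = re.search(r"_(m.*)", line)
    return match.group(1) if match else None


if __name__ == "__main__":
    log_line = ("Xxx x xx:xx:xx xmmxxx XXXXXX: XXXXXXX:XXX: "
                "xxx_Mxxx_Xxxxxx_mxxxxxmmxx [XXX xxxx.")
    print(tail_from_m(log_line))
```

The match is case-sensitive, so the earlier _M in the sample line is skipped.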

SED: Inserting an existing pattern, to several other places on the same line

Again a SED question from me :)
So, same as last time, I'm wrestling with phone numbers. This time the problem is a bit different.
I currently have this kind of organization in my text file:
Areacode: List of phone numbers:
4444 NUM:111111 NUM:2222222 NUM:33333333
5555 NUM:1111111 NUM:2222 NUM:3333333 NUM:44444444 NUM:5555555
Now, every areacode can have unknown number of numbers, and also the phone numbers are not fixed in length.
What I would like to know, is how could I combine areacode and phone number, to look something like this:
4444-111111, 4444-2222222, 4444-33333333
My first idea was to add again a line break before each phone number and to match these sections with regex, and then just add the first remembered item to second, and first to third:
\1-\2, \1-\3, etc
But of course sed can only remember 9 groups, and there can be more than 10 numbers on one line, so this doesn't work. The variable-length list of phone numbers also makes this a no-go.
I'm again looking primarily for a sed option, as I've been trying to get proficient with it, but more efficient solutions with other tools are of course definitely welcome!
$ cat input.txt | sed '1d;s/NUM:/ /g' | awk '{for(i=2;i<=NF;i++)printf("%s-%s%s", $1, $i, i==NF?"\n":",")}'
4444-111111,4444-2222222,4444-33333333
5555-1111111,5555-2222,5555-3333333,5555-44444444,5555-5555555
This might work for you:
sed '1d;:a;s/^\(\S*\)\(.*\)NUM:/\1\2,\1-/;ta;s/[^,]*,//;s/ //g' file
4444-111111,4444-2222222,4444-33333333
5555-1111111,5555-2222,5555-3333333,5555-44444444,5555-5555555
or:
awk 'NR>1{gsub(/NUM:/,","$1"-");sub(/[^,]*,/,"");gsub(/ /,"");print}' file
4444-111111,4444-2222222,4444-33333333
5555-1111111,5555-2222,5555-3333333,5555-44444444,5555-5555555
TXR:
@(collect)
@area @(coll :mintimes 1)NUM:@{num /[0-9]+/}@(end)
@(output)
@(rep)@area-@num, @(last)@area-@num@(end)
@(end)
@(end)
Run:
$ txr phone.txr phone.txt
4444-111111, 4444-2222222, 4444-33333333
5555-1111111, 5555-2222, 5555-3333333, 5555-44444444, 5555-5555555
$ cat phone.txt
Areacode: List of phone numbers:
4444 NUM:111111 NUM:2222222 NUM:33333333
5555 NUM:1111111 NUM:2222 NUM:3333333 NUM:44444444 NUM:5555555
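Since the number of phone numbers per line is not fixed, splitting the line is simpler than using back-references. The following Python sketch shows the idea. The function names combine and convert are hypothetical, and it assumes the first line is the header and every number is prefixed with NUM: as in the question.

```python
def combine(line: str) -> str:
    """Turn '4444 NUM:1 NUM:2' into '4444-1, 4444-2'."""
    area, *tokens = line.split()
    # Keep only the tokens that carry the NUM: prefix and strip it.
    numbers = [t[len("NUM:"):] for t in tokens if t.startswith("NUM:")]
    return ", ".join(f"{area}-{number}" for number in numbers)


def convert(text: str):
    """Skip the header line and convert every non-empty data line."""
    data_lines = text.splitlines()[1:]
    return [combine(line) for line in data_lines if line.strip()]


if __name__ == "__main__":
    sample = (
        "Areacode: List of phone numbers:\n"
        "4444 NUM:111111 NUM:2222222 NUM:33333333\n"
        "5555 NUM:1111111 NUM:2222 NUM:3333333 NUM:44444444 NUM:5555555\n"
    )
    print("\n".join(convert(sample)))
```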

Simplify points in KML using regex

I am trying to cut down the file size of a kml file I have.
The coordinates for the polygons are this accurate:
-113.52106535153605,53.912817815321503,0.
I am not very good with regex, but I think it would be possible to write one that selects the eight characters before the commas. I'd run a search and replace so the result would be
-113.521065,53.9128178,0.
Any regex experts out there think this is possible?
Try this
\d{8}(?=,)
and replace with an empty string
Here is something that might work. It replaces 8 characters plus the comma with just a comma: s/(.{8}),/,/g
echo "-113.52106535153605,53.912817815321503,0." | sed 's/.\{8\},/,/g'
The g flag is needed so that every coordinate on the line is shortened, not just the first one. So you can pipe your file into a sed command like this:
cat file.kml | sed 's/.\{8\},/,/g' > newfile.kml
I just had to do the same thing. This is Perl instead of sed, but it looks for a string of eight uninterrupted digits and then replaces any number of uninterrupted digits after that with nothing. It worked great.
cat originalfile.kml | perl -pe 's/(?<=\d{8})\d*//g' > shortenedfile.kml
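A slightly different take, in case a fixed number of decimal places is preferable to dropping a fixed number of characters: the Python sketch below truncates every decimal number to a given number of places. This is not what the answers above do. The function name truncate_coords and the default of six places are assumptions, and on a real KML file it would also shorten any other long decimal numbers outside the coordinates.

```python
import re


def truncate_coords(text: str, places: int = 6) -> str:
    """Cut every decimal number in text down to the given number of places."""
    # Keep the integer part, the dot and the first `places` digits;
    # drop whatever digits follow.
    pattern = re.compile(rf"(\d+\.\d{{{places}}})\d+")
    return pattern.sub(r"\1", text)


if __name__ == "__main__":
    print(truncate_coords("-113.52106535153605,53.912817815321503,0."))
```

Numbers that already have no more than the requested number of places, such as the trailing 0., are left unchanged.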