add dot before first integer in all lines - regex

I have lines like
Input:
abcd1234
bdfghks4506
agfdch6985
I would like to add "." before the first integer in line, how do I do it?
Output:
abcd.1234
bdfghks.4506
agfdch.6985

This might work for you (GNU sed):
sed -i 's/[[:digit:]]/.&/' file
If there is a digit in a line, put a . before the first one.
N.B. To put a . before every digit in a file use:
sed -i 's/[[:digit:]]/.&/g' file
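To compare the two commands without touching a file, a minimal sketch can pipe the question's sample lines through sed and drop -i, which is only needed for in-place editing. The variable names are illustrative only.

```shell
# Sample data from the question, fed on stdin so no file is modified.
input='abcd1234
bdfghks4506
agfdch6985'

# Without the g flag, only the first digit on each line gets a dot.
first_only=$(printf '%s\n' "$input" | sed 's/[[:digit:]]/.&/')

# With the g flag, every digit on each line gets a dot.
every_digit=$(printf '%s\n' "$input" | sed 's/[[:digit:]]/.&/g')

printf '%s\n' "$first_only"
printf '%s\n' "$every_digit"
```

Once the output looks right, adding -i back applies the same substitution to the file itself.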

$ cat > input.txt
abcd1234
bdfghks4506
agfdch6985
$ sed -e 's/^\([^0-9]*\)\([0-9]\)\(.*\)$/\1.\2\3/' input.txt
abcd.1234
bdfghks.4506
agfdch.6985
Use sed string replacement with regular expression capture groups.
Match the beginning of the line.
Start a capture group that matches any number of non-numeric characters.
Start a second capture group that matches a single digit.
Start a third capture group that matches the remainder of the line.
Match the end of the line.
Replace the entire line with the contents of the first capture group, ".", the second capture group and, finally, the third capture group.
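The steps above can also be sketched in extended regex syntax (-E), where groups need no backslashes. This is a variant under the assumption that your sed supports -E, not the command from the answer itself. The third group is left out because sed keeps any text after the match as it is.

```shell
# Group 1: leading non-digits. Group 2: the first digit.
# Everything after the match stays in place, so no third group is needed.
result=$(printf '%s\n' 'abcd1234' 'bdfghks4506' 'agfdch6985' |
  sed -E 's/^([^0-9]*)([0-9])/\1.\2/')

printf '%s\n' "$result"
```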

The awk function sub may help:
$ awk '{ sub(/[0-9]/, ".&") } 1' file
abcd.1234
bdfghks.4506
agfdch.6985
Brief explanation:
sub: replaces only the first matching substring in each line (it operates on $0 when no target is given)
&: in the replacement, stands for the text that was actually matched (here, the digit matched by [0-9])
Trailing 1: a pattern that is always true, so awk prints every line, changed or not.
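As a quick check of the edge cases, here is a small sketch with made-up lines (nodigits and 42first are not from the question). It writes the awk program with an explicit action block followed by the always-true pattern 1.

```shell
# A line without digits passes through unchanged: sub() finds nothing to replace.
# A line that starts with a digit gets the dot at the very beginning.
result=$(printf '%s\n' 'abcd1234' 'nodigits' '42first' |
  awk '{ sub(/[0-9]/, ".&") } 1')

printf '%s\n' "$result"
```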

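A stricter command for your case would be
sed -E -i 's/([a-z])([0-9])/\1.\2/' file.txt
-E Use extended regex
-i Edit the file in place instead of writing to standard output (BSD/macOS sed needs an explicit backup suffix, e.g. -i '')
This matches every example you provided: a lowercase letter immediately followed by a digit.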

Not familiar with awk / sed specifically, but a regex replace using this regex should be all you need:
Search: (\d+.*?)$ (match everything from the first found number to the end of the line)
Replace by: .$1 (captured group #1 prefixed by a literal .)
The notation of the capture group in the replace command may differ depending on the implementation. I used $1 here, but some implementations may use \1.

Related

How can I replace a string containing special characters?

I have a text file that contains a line with brackets, character, integers, and : symbols.
I want to replace [0:1] with [2:4]
$ cat input.txt
str(tr.dx)[0:1]
Expected output:
str(tr.dx)[2:4]
I tried
sed -i 's/str(tr.dx)[0:1]/str(tr.dx)[2:4]/g' input.txt
but it does not work. How can I fix this?
You may use this sed:
sed 's/\(str(tr\.dx)\)\[0:1]/\1[2:4]/' file
str(tr.dx)[2:4]
Here:
\(str(tr\.dx)\) matches str(tr.dx) and captures it in group #1
We need to escape the dot in regex
\[0:1] matches [0:1]. Here we need to escape [
\1 is back-reference for capture group #1
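To see why the original attempt fails, a minimal sketch can run both patterns on the sample line. The second command is a shortened form of the answer's idea without the capture group. Variable names are illustrative.

```shell
line='str(tr.dx)[0:1]'

# Unescaped, [0:1] is a bracket expression that matches ONE character
# (0, : or 1), so the pattern never matches the literal text "[0:1]".
unescaped=$(printf '%s\n' "$line" | sed 's/str(tr.dx)[0:1]/str(tr.dx)[2:4]/g')

# Escaping the opening bracket (and the dot) makes the match literal.
escaped=$(printf '%s\n' "$line" | sed 's/str(tr\.dx)\[0:1]/str(tr.dx)[2:4]/')

printf '%s\n%s\n' "$unescaped" "$escaped"
```

The replacement side needs no escaping here, because brackets and dots are only special in the pattern.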

replace last n parts after spliting on delimiter using sed or regex

I need to replace last 2 parts of the string separated by delimiter with empty space to clean up the name.
Example:
something-useful-a12356-78929
=>
something-useful
something-more-useful-v35f62-2728902
=>
something-more-useful
I tried the following:
echo "something-useful-12345-67890" | sed -re 's/(-([0-9])+)//g'
This works if my last 2 elements of delimiter are numbers only, but wouldn't work for the example above. I need to remove the last 2 parts after splitting it on "-"
I can only use sed or regex to solve this.
Does sed 's/\(-[^-]*\)\{2\}$//' file do what you want?
Use [^-] to match anything other than -. Use $ to match the end of the string. Match hyphen followed by non-hyphens twice at the end.
sed -r 's/(-[^-]+){2}$//'
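This might work for you (GNU sed), but only when the part to keep contains exactly one hyphen:
sed -re 's/-[^-]*//2g' file
Removes every match of - followed by non-hyphen characters, from the second occurrence onward. something-useful-a12356-78929 becomes something-useful, but something-more-useful-v35f62-2728902 becomes something-more, so it does not handle the second example.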
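The end-anchored approach can be checked against both examples from the question with a small sketch. strip_last_parts is a hypothetical helper name, and the sketch assumes a sed that supports -E.

```shell
# strip_last_parts N: hypothetical helper that removes the last N
# hyphen-separated parts from each line read on stdin.
strip_last_parts() {
  sed -E "s/(-[^-]+){$1}\$//"
}

out1=$(printf '%s\n' 'something-useful-a12356-78929' | strip_last_parts 2)
out2=$(printf '%s\n' 'something-more-useful-v35f62-2728902' | strip_last_parts 2)

printf '%s\n%s\n' "$out1" "$out2"
```

Anchoring with $ is what makes the count apply from the end of the string, regardless of how many hyphens the kept prefix contains.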

Sed removing after ip

I have a simple sed question.
I have data like this:
boo:moo:127.0.0.1--¹óÖÝÊ¡µçÐÅ
foo:joo:127.0.0.1 ÁÉÄþÊ¡ÉòÑôÊвʺçÍø°É
How do I make it like this:
boo:moo:127.0.0.1
foo:joo:127.0.0.1
My sed code
sed -e 's/\.[^\.]*$//' test.txt
Thanks!
For the given sample, you could capture everything from start of line till last digit in the line
$ sed 's/\(.*[0-9]\).*/\1/' ip.txt
boo:moo:127.0.0.1
foo:joo:127.0.0.1
$ grep -o '.*[0-9]' ip.txt
boo:moo:127.0.0.1
foo:joo:127.0.0.1
Or, you could delete all non-digit characters at end of line
$ sed 's/[^0-9]*$//' ip.txt
boo:moo:127.0.0.1
foo:joo:127.0.0.1
You may find an IP like substring and remove all after it:
sed -E 's/([0-9]{1,3}(\.[0-9]{1,3}){3}).*/\1/' # POSIX ERE version
sed 's/\([0-9]\{1,3\}\(\.[0-9]\{1,3\}\)\{3\}\).*/\1/' # BRE POSIX version
The ([0-9]{1,3}(\.[0-9]{1,3}){3}) pattern is a simplified IP address regex pattern that matches and captures 1 to 3 digits and then 3 occurrences of a dot and again 1 to 3 digits, and then .* matches and consumes the rest of the line. The \1 placeholder in the replacement pattern inserts the captured value back into the result.
Note that in the BRE POSIX pattern, you have to escape ( and ) to make them a capturing group construct and you need to escape {...} to make it a range/interval/limiting quantifier (it has lots of names in the regex literature).
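The difference between the "last digit" approach and the IP-based approach shows up when the trailing text itself contains digits. The sketch below uses ASCII stand-ins for the trailing junk, and the third line is invented for illustration. If the real file contains bytes that are invalid in the current locale's encoding, GNU sed's . may not match them; prefixing the command with LC_ALL=C is a common workaround, but that is an assumption about your environment.

```shell
# ASCII stand-ins for the trailing junk; the third line has digits after the IP.
input='boo:moo:127.0.0.1--junk
foo:joo:127.0.0.1 more junk
zoo:koo:10.20.30.40 room 42'

# Keep everything up to the last digit: leaves " room 42" on the third line.
last_digit=$(printf '%s\n' "$input" | sed 's/\(.*[0-9]\).*/\1/')

# Cut right after the first IP-like substring: handles all three lines.
ip_based=$(printf '%s\n' "$input" |
  sed -E 's/([0-9]{1,3}(\.[0-9]{1,3}){3}).*/\1/')

printf '%s\n' "$last_digit"
printf '%s\n' "$ip_based"
```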

Extract QueryString value using sed

I have the following lines in an apache access log
/sms/receiveHLRLookup?Ported=No&Status=Success&MSISDN=647930229655&blah
/sms/receiveHLRLookup?Ported=No&Status=Success&MSISDN=647930229656&blah
/sms/receiveHLRLookup?Ported=No&Status=Success&MSISDN=647930229657&blah
/sms/receiveHLRLookup?Ported=No&Status=Success&MSISDN=647930229658&blah
and i want to extract the MSISDN value only, so expected output would be
647930229655
647930229656
647930229657
647930229658
I'm using the following sed command but i can't get it to stop capturing at &
sed 's/.*MSISDN=\(.*\)/\1/'
sed solution:
sed -E 's/.*&MSISDN=([^&]+).*/\1/' file
& - is key/value pair separator in URL syntax, so you should rely on it
([^&]+) - 1st captured group containing any character sequence except &
\1 - backreference to the 1st captured group
The output:
647930229655
647930229656
647930229657
647930229658
grep -oP 'MSISDN=\K\d+' input
-o: print only the matching part of each line, not the whole line.
-P: enable PCRE (Perl-compatible) regex.
\K: drop everything matched so far from the reported match; the text to its left must still be present in the input.
\d: a digit; + means one or more of them.
647930229655
647930229656
647930229657
647930229658
The following simple sed may also help:
sed 's/.*MSISDN=//;s/&.*//' Input_file
Explanation:
s/.*MSISDN=//: substitutes everything up to and including MSISDN= with nothing.
;: the semicolon separates this command from the next one.
s/&.*//: substitutes everything from the first remaining & to the end of the line with nothing.
$ grep -oP '(?<=&MSISDN=)\d+' file
647930229655
647930229656
647930229657
647930229658
-o option is meant to show only matched output
-P option is meant to enable PCRE (Perl Compatible Regex)
(?<=regex): a positive lookbehind assertion. Lookarounds don't consume any characters while matching, unlike the rest of the regex, so the only text reported as the match is \d+, i.e. one or more digits.
or using sed:
$ sed -r 's/^.*MSISDN=([0-9]+).*$/\1/' file
647930229655
647930229656
647930229657
647930229658
You can also pipe cut into cut, provided MSISDN is always the third &-separated field:
cut -d '&' -f3 Input_file |cut -d '=' -f2
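If the parameter order is not guaranteed, a sed pattern keyed on the parameter name is more forgiving than a fixed field position. The sketch below is one way to do that. The second and third input lines are hypothetical and only there to show a different position and a line without MSISDN.

```shell
input='/sms/receiveHLRLookup?Ported=No&Status=Success&MSISDN=647930229655&blah
/sms/receiveHLRLookup?MSISDN=647930229656&Ported=No
/sms/other?Ported=No'

# -n plus the p flag prints only lines where the substitution succeeded.
# [?&] ties the key to the start of a parameter, wherever it appears.
# [^&]* stops the capture at the next & (or at the end of the line).
msisdns=$(printf '%s\n' "$input" |
  sed -n 's/.*[?&]MSISDN=\([^&]*\).*/\1/p')

printf '%s\n' "$msisdns"
```

Lines without an MSISDN parameter are skipped instead of being echoed back unchanged.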

Can I find and replace multiple words at the same time in sublime text 3?

I want to find and replace several words at the same time in sublime, for example I want to replace word1,word2,word3 by word4,word5,word6 at the same time instead of doing them in 3 steps. Can I do it in sublime? if not how can I come up with a regexp to do this?
Thank you
In this case sed command might help you.
sed -i -e 's/word1/replace1/g' fileName && sed -i -e 's/word2/replace2/g' fileName && ...
Chain one such command for each word you need to replace, then run the whole line in your terminal.
Example:
this is a content of file. File name is t.txt
I want to replace this with that and file with FILE. then command will be :
sed -i -e 's/this/that/g' t.txt && sed -i -e 's/file/FILE/g' t.txt
output will be :
that is a content of FILE. File name is t.txt
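The same result can be had in a single sed invocation by passing several -e expressions. The sketch below reads the sample text from stdin rather than editing t.txt in place.

```shell
# Several -e expressions run in order on each line in one sed pass.
result=$(printf '%s\n' 'this is a content of file. File name is t.txt' |
  sed -e 's/this/that/g' -e 's/file/FILE/g')

printf '%s\n' "$result"
```

Because the expressions run one after another, a later one also sees text produced by an earlier one. This matters if a replacement word is itself one of the words to be replaced.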
You can use the pattern word1(.*)word2(.*)word3 to match the words, if the words to be replaced appear in that order, with or without other characters in between, and use word4\1word5\2word6 as the replacement.
The pattern uses 2 capture groups to match the characters, if any, between the words to be replaced. \1 and \2 represent the captured groups.
Here, the replacement pattern says that the matched text is replaced with word4, followed by the characters captured in the first group (i.e. whatever sits between word1 and word2), followed by word5, and so on.