Sed usage, remove / change text

Sed usage, remove / change text - regex

I have a string that is generated as follows:
OK::82.44.127.13:GB:UNITED KINGDOM:ENGLAND:WOKING:-:51.000:-0.55813:+01:00
I need a sed command to change it to:
82.44.127.13;GB UNITED KINGDOM ENGLAND WOKING;51.000;-0.55813
I think this will require multiple runs of sed. So far I have:
sed 's/:-:/;/g' (which changes the :-: to a ;)
sed 's/:-/;-/g' (which changes the :- to ;-)
sed s/OK:://g (which strips the OK::)
But I'm stuck on how to change the : between the IP address and the location to a ;, then replace all the other : with spaces, and also strip off the time offset.
(Note: these are not real IP addresses.)

With awk it can be easier:
$ awk -F ":" '{print $3";"$4" "$5" "$6" "$7";"$9";"$10}' <<< "OK::82.44.127.13:GB:UNITED KINGDOM:ENGLAND:WOKING:-:51.000:-0.55813:+01:00"
82.44.127.13;GB UNITED KINGDOM ENGLAND WOKING;51.000;-0.55813
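Note that I am printing things like $3";"$4" "$5 because the output separator varies: a ; between some fields and a space between others. If every field were joined by the same separator, we could set it once with BEGIN{OFS=";"}: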
$ awk -F ":" 'BEGIN{OFS=";"}{print $3,$4,$5,$6,$7,$9,$10}' <<< "OK::82.44.127.13:GB:UNITED KINGDOM:ENGLAND:WOKING:-:51.000:-0.55813:+01:00"
82.44.127.13;GB;UNITED KINGDOM;ENGLAND;WOKING;51.000;-0.55813

This will do everything in one sed command:
$ echo "OK::82.44.127.13:GB:UNITED KINGDOM:ENGLAND:WOKING:-:51.000:-0.55813:+01:00" | \
sed -r 's/OK::([0-9.]*):([A-Z ]*):([A-Z ]*):([A-Z ]*):([A-Z ]*):-:(-?[0-9.]*):(-?[0-9.]*):.*/\1;\2 \3 \4 \5;\6;\7/'
82.44.127.13;GB UNITED KINGDOM ENGLAND WOKING;51.000;-0.55813
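The step-by-step approach from the question can also be finished as a chain of -e expressions in one sed call. This is only a sketch: the function name reformat_geo is made up for illustration, and it assumes every line has exactly the layout of the sample (OK:: prefix, a :-: marker, and a trailing offset such as :+01:00).

```shell
# Hypothetical helper; reads lines in the question's format on stdin.
# The expressions run in order:
#   1. strip the leading OK::
#   2. strip the trailing offset (the last two colon-separated pieces)
#   3. turn the :-: marker into ;
#   4. turn the last remaining : into ;
#   5. turn the first remaining : (after the address) into ;
#   6. turn every other : into a space
reformat_geo() {
    sed -e 's/^OK:://' \
        -e 's/:[^:]*:[^:]*$//' \
        -e 's/:-:/;/' \
        -e 's/:\([^:]*\)$/;\1/' \
        -e 's/:/;/' \
        -e 's/:/ /g'
}

printf '%s\n' 'OK::82.44.127.13:GB:UNITED KINGDOM:ENGLAND:WOKING:-:51.000:-0.55813:+01:00' | reformat_geo
```

Only basic regular expressions are used, so this should not depend on GNU extensions.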

Without awk:
cut -d: -f 3-7,9,10 | tr ":" ";" | sed -r 's/([A-Z]);([A-Z])/\1 \2/g'

Related

how to regex replace before colon?

this is my original string:
NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
I only want to add a backslash before each space that comes before the ':'
so, this is what I finally want:
NetworkManager/system\ connections/Wired\ 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
I need to do this in bash, so sed, awk and grep are all OK for me.
I have tried the following sed commands, but none of them work:
echo NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1 | sed 's/ .*\(:.*$\)/\\ .*\1/g'
echo NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1 | sed 's/\( \).*\(:.*$\)/\\ \1.*\2/g'
echo NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1 | sed 's/ .*\(:.*$\)/\\ \1/g'
echo NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1 | sed 's/\( \).*\(:.*$\)/\\ \1\2/g'
Thanks for answering my question.
I am still quite new to Stack Overflow and don't know how to control the formatting in comments,
so I have edited my original question instead.
My real story is:
When I use grep or cscope to search for a keyword, for example "address1" under the /etc folder,
the result looks like:
./NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
If I use vim to open the file under the cursor, and my cursor is on the word "NetworkManager",
then vim understands the file name as
"./NetworkManager/system"
That's why I want to add "\" before each space, so the search result is more vim friendly :)
I did try to change cscope's source code, but it is very difficult to fully achieve this that way, so I have to do a post-replacement :(

If you only want to do the replacements when a : is present in the line, set the input and output field separators to a colon and check that there are at least 2 fields.
Data:
$ cat file
NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
NetworkManager/system connections/Wired 1.nmconnection 14 address1=10.1.10.71/24,10.1.10.1
Example in awk:
awk 'BEGIN {FS=OFS=":"}{if(NF>1)gsub(" ","\\ ",$1)}1' file
Output
NetworkManager/system\ connections/Wired\ 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
NetworkManager/system connections/Wired 1.nmconnection 14 address1=10.1.10.71/24,10.1.10.1

This can be done simply in awk. With your shown samples, please try the following:
awk 'BEGIN{FS=OFS=":"} {gsub(/ /,"\\\\&",$1)} 1' Input_file
Explanation: Set both the field separator and the output field separator to :. Then, in the main block, use awk's gsub (global substitution) function to replace each space in the 1st field with a backslash followed by that space (& stands for the matched text). Only the 1st field is changed because, per the OP's remarks, the replacement should happen before the :. The trailing 1 prints the line. Note that a line with no : at all is a single field, so all of its spaces get escaped.

An idea for a perl one-liner in bash that uses \G and \K (similar to @CarySwoveland's comment).
perl -pe 's/\G[^ :]*\K /\\ /g' myfile
See this demo at tio.run or a pattern demo at regex101.

This might work for you (GNU sed):
sed -E ':a;s/^([^: ]*) /\1\n/;ta;s/\n/\\ /g' file
Replace each space before the first : with a newline, then replace the newlines with "\ " (a backslash followed by a space).
Alternative using the hold space:
sed -E 's/:/\n:/;h;s/ /\\ /g;G;s/\n.*\n//' file
Split the line on the first :.
Amend the front section, remove the middle and append the unadulterated back section.
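The same "split on the first :, amend the front, re-attach the back" idea can also be sketched in plain POSIX shell with parameter expansion. The function name escape_before_colon is hypothetical, and the sketch assumes one record per line on stdin; lines without a : are passed through unchanged.

```shell
# Hypothetical helper: escape spaces only in the part before the first ':'.
escape_before_colon() {
    while IFS= read -r line; do
        case $line in
            *:*)
                front=${line%%:*}   # everything before the first ':'
                back=${line#*:}     # everything after the first ':'
                # Put a backslash in front of every space in the front part.
                front=$(printf '%s\n' "$front" | sed 's/ /\\ /g')
                printf '%s:%s\n' "$front" "$back"
                ;;
            *)
                # No ':' on this line: print it as it is.
                printf '%s\n' "$line"
                ;;
        esac
    done
}

printf '%s\n' 'NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1' | escape_before_colon
```

It starts one sed process per line, so for large inputs the awk or single-sed answers are likely the better choice.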

My answer is ugly and I think RavinderSingh13's answer is THE ONE, but I already took the time to write mine and it works (it's written step by step, but it's a one-line command).
I got inspired by HatLess's answer.
First get the text before the : with cut (I put the string in a file to make it easy to read, but this also works with echo):
cut -d':' -f1 infile
Then replace the spaces using sed (this only escapes spaces that follow a lowercase letter, which is enough for the sample):
cut -d':' -f1 infile | sed 's/\([a-z]\) /\1\\ /g'
Then echo the output with no newline:
echo -n "$(cut -d':' -f1 infile | sed -e 's/\([a-z]\) /\1\\ /g')"
Add the missing : and everything that comes after it (-f2- rather than -f2, so that later colons are kept):
echo -n "$(cut -d':' -f1 infile | sed -e 's/\([a-z]\) /\1\\ /g')" | cat - <(echo -n :) | cat - <(cut -d':' -f2- infile)

unexpected result by cutting the last column with sed

echo '60 test' | sed -r 's/(.*)\s+[^\s]+$/\1/'
result:
60 test
the last column is not cut. but it works pretty well with
echo '60 home' | sed -r 's/(.*)\s+[^\s]+$/\1/'
result:
60
why?

[^\s]+ means "one or more characters that are neither a backslash nor an s", because \s has no special meaning inside a bracket expression. test contains an s while home does not, so the regexp matches the latter but not the former.
You should have used either of these instead to match non-space:
$ echo '60 test' | sed -r 's/(.*)\s+\S+$/\1/'
60
$ echo '60 test' | sed -r 's/(.*)\s+[^[:space:]]+$/\1/'
60
As @potong suggested in a comment, to remove the last column with sed all you really need is:
sed -E 's/\s+\S+$//'
I switched from -r to -E because -r is GNU sed only while -E works in both GNU and OSX/BSD sed, so it's generally the better option. However, OSX/BSD sed won't recognize \s or \S, so changing from -r to -E doesn't really make the script more portable in this case; you'd have to use this instead:
sed -E 's/[[:space:]]+[^[:space:]]+$//'
and then to be completely portable to all POSIX seds it'd be:
sed 's/[[:space:]]\{1,\}[^[:space:]]\{1,\}$//'
or this would behave the same if there are always 2 or more fields:
sed 's/[[:space:]]*[^[:space:]]*$//'
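As a quick check of the POSIX form, here is a small sketch that wraps it in a function (the name drop_last_field is made up for illustration) and runs it on lines with one, two and three fields. The $ anchor is what ties the match to the last field rather than the first one found.

```shell
# Hypothetical helper: remove the last whitespace-separated field of each line.
# A line with a single field has no leading whitespace to match, so it is left alone.
drop_last_field() {
    sed 's/[[:space:]]\{1,\}[^[:space:]]\{1,\}$//'
}

printf '%s\n' '60 test' 'a b c' 'single' | drop_last_field
```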

If you are just printing the first part of your string before the space without doing any other modification, you can simply use cut
echo '60 test' | cut -d' ' -f1
60
where you define your delimiter (-d) and the field (-f) you want to select.
No need to go for a complex solution using sed and doing some replacement operations.
With awk you can also print the first field:
echo '60 test' | awk '{print $1}'
60
or via grep in perl mode to have the \s taken into account
echo '60 test' | grep -oP '^.*?(?=\s)'
60

How to remove special characters like a single quote from a string?

I tried using sed but it did not work.
Basically, I have a string, say:
Input:
'http://www.google.com/photos'
Required output:
http://www.google.com
I tried using sed, but I could not escape the '.
What I did was:
sed 's/\'//' | sed 's/photos//'
The sed for photos worked, but the one for ' didn't.
Please suggest a solution.

Escaping ' in sed is possible via a workaround:
sed 's/'"'"'//g'
#      |^^^+--- bash string with the single quote inside
#      |   '--- return to sed string
#      '------- leave sed string and go to bash
But for this job you should use tr:
tr -d "'"
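For comparison, here is a small sketch of the usual ways to get a literal single quote into a sed script from a POSIX shell, next to the tr version. The variable name s is only for illustration; all four commands should print the same unquoted URL.

```shell
s="'http://www.google.com/photos'"

# Close the single-quoted string, add a double-quoted ', reopen it.
printf '%s\n' "$s" | sed 's/'"'"'//g'

# Close the single-quoted string, add a backslash-escaped ', reopen it.
printf '%s\n' "$s" | sed 's/'\''//g'

# Put the whole sed script in double quotes instead.
printf '%s\n' "$s" | sed "s/'//g"

# No regex needed at all: delete every single quote with tr.
printf '%s\n' "$s" | tr -d "'"
```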

Perl substitutions use a syntax very similar to sed's, and perl is installed by default on most systems and behaves the same way on all of them (portability):
$ echo "'http://www.google.com/photos'" |perl -pe "s#\'##g;s#(.*//.*/)(.*$)#\1#g"
http://www.google.com/
Mind that this solution keeps everything up to and including the last slash and discards whatever follows it; for this input that leaves http://www.google.com/
If you want to do it with sed , you can use sed "s/'//g" as advised by Wiktor Stribiżew in comments.
PS: I sometimes refer to special chars with their ascii hex code of the special char as advised by man ascii, which is \x27 for '
So for sed you can do it:
$ echo "'http://www.google.com/photos'" |sed -r "s#'##g; s#(.*//.*/)(.*$)#\1#g;"
http://www.google.com/
# sed "s#\x27##g" will also remove the single quote using its hex ASCII code (GNU sed).
$ echo "'http://www.google.com/photos'" |sed -r "s#'##g; s#(.*//.*)(/.*$)#\1#g;"
http://www.google.com #Without the last slash
If your string is stored in a variable, you can achieve above operations with pure bash, without the need of external tools like sed or perl like this:
$ a="'http://www.google.com/photos'" && a="${a:1:-1}" && echo "$a"
http://www.google.com/photos
# This removes the 1st and last char of the variable, whatever those chars are (a negative length needs bash 4.2 or later).
$ a="'http://www.google.com/photos'" && a="${a:1:-1}" && echo "${a%/*}"
http://www.google.com
#This deletes everything from the last slash / to the end of the string.
#If you need the last slash you can just add it to the echo manually like echo "${a%/*}/" -->http://www.google.com/
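Since ${a:1:-1} drops the first and last characters whatever they are, a slightly more defensive sketch strips them only when they really are single quotes. It uses POSIX parameter expansion only; the function name strip_quotes is hypothetical.

```shell
# Hypothetical helper: remove one leading and one trailing single quote, if present.
strip_quotes() {
    s=$1
    s=${s#\'}   # drop a leading ' if there is one
    s=${s%\'}   # drop a trailing ' if there is one
    printf '%s\n' "$s"
}

url=$(strip_quotes "'http://www.google.com/photos'")
printf '%s\n' "$url"        # the URL without the surrounding quotes
printf '%s\n' "${url%/*}"   # the same URL cut at its last slash
```

An input that has no surrounding quotes passes through unchanged.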

It's unclear if the ' are actually around your string, although this should take care of it:
str="'http://www.google.com/photos'"
echo "$str" | sed s/\'//g | sed 's/\/photos//g'
Combined:
echo "$str" | sed -e "s/'//g" -e 's/\/photos//g'
Using tr:
echo "$str" | sed -e "s/\/photos//g" | tr -d \'
Result:
http://www.google.com
If there are no single quotes around your string, it still works.

(GNU)Sed: how to replace any character from nth character to nth+10?

I need to replace characters from 10th to 20th in the string which looks like that:
123456789012345678901234567890
So far I've tried:
a)
Works for the 10th character ONLY:
echo "123456789012345678901234567890" | sed 's/./X/10'
b)
Doesn't work on the range:
echo "123456789012345678901234567890" | sed 's/./X/10,20'
echo "123456789012345678901234567890" | sed 's/./X/10\,20'
echo "123456789012345678901234567890" | sed 's/./X/\{10,20\}'
echo "123456789012345678901234567890" | sed 's/./X/\{10\,20\}'
Does not work and I get error
unknown option to `s'
So - the question is - how do I make this to work:
echo "123456789012345678901234567890" | sed 's/./X/10,20'

Try:
$ sed -r "s/^(.{9})(.{11})/\1XXXXXXXXXXX/" <<< 123456789012345678901234567890
123456789XXXXXXXXXXX1234567890

It is a complex sed problem; the only solution I could find is to keep the first 9 characters and replace the next 11 (the 10th to the 20th):
$ sed 's/^\(.\{9\}\)\(.\{11\}\)/\1XXXXXXXXXXX/' <<< 123456789012345678901234567890
123456789XXXXXXXXXXX1234567890
With awk it looks nicer (splitting into characters with an empty FS is an extension, supported by gawk):
$ awk 'BEGIN{FS=OFS=""} {for (i=10;i<=20;i++) $i="X"} {print}' <<< 123456789012345678901234567890
123456789XXXXXXXXXXX1234567890
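If the positions need to be parameters rather than hard-coded counts, a sketch with awk's substr avoids both the repeated X's and the empty-FS extension. The function name mask_range and its two arguments (first and last position, 1-based and inclusive) are assumptions made for this example.

```shell
# Hypothetical helper: mask_range FIRST LAST
# Replaces characters FIRST..LAST (1-based, inclusive) of each input line with X.
mask_range() {
    awk -v first="$1" -v last="$2" '{
        mid = substr($0, first, last - first + 1)   # the part to mask
        gsub(/./, "X", mid)                         # one X per masked character
        print substr($0, 1, first - 1) mid substr($0, last + 1)
    }'
}

printf '%s\n' 123456789012345678901234567890 | mask_range 10 20
```

Lines shorter than the requested range are masked as far as they go, because substr simply returns fewer characters.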

You can do it with bash parameter substitution like this:
#!/bin/bash
s="123456789012345678901234567890"
l=${s:0:9} # Extract left part (characters 1-9)
m=${s:9:11} # Extract middle part (characters 10-20; offsets are 0-based)
r=${s:20} # Extract right part (characters 21 onwards)
# Diddle with middle part to your heart's content
m=$(sed 's/./X/g' <<<"$m")
# Re-assemble when done
echo "$l$m$r"
See here for more explanation and examples.
Or, you can do this:
transform the row of letters into a column so each is on its own line
apply your edits to LINES 10 through 20 (as opposed to characters 10 through 20)
transform column of letters back into a row (by deleting linefeeds)
as shown in the one-liner below:
$ echo "123456789012345678901234567890" | sed "s/\(.\)/\1\n/g" | sed "10,20s/./X/" | tr -d "\n"

I know that it looks ugly, but:
echo "123456789012345678901234567890" | \
sed 's/^\(.\{9\}\).\{11\}\(.*\)/\1XXXXXXXXXXX\2/'

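Without placing multiple X in the sed command (the first sed splits the line into three lines, the second masks line 2, and tr joins them again, dropping the final newline as well):
sed -r 's/^(.{9})(.{11})(.*)$/\1\n\2\n\3/' | sed '2s/./X/g' | tr -d '\n'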

To replace the 10th to 20th characters, inclusive, try:
echo 123456789012345678901234567890 | sed 's/\(.\{9\}\).\{11\}/\1XXXXXXXXXXX/'
123456789XXXXXXXXXXX1234567890
With GNU sed, you can use the -r switch to remove most of the backslashes:
echo 123456789012345678901234567890 | sed -r 's/(.{9}).{11}/\1XXXXXXXXXXX/'
Or the naive approach also works here:
echo 123456789012345678901234567890 | sed 's/\(.........\).........../\1XXXXXXXXXXX/'

This might work for you (GNU sed):
sed ':a;/.\{9\}X\{11\}/!s/\(.\{9\}X*\)./\1X/;ta' file
or with a bit of syntactic sugar:
sed -r ':a;/.{9}X{11}/!s/(.{9}X*)./\1X/;ta' file

Regular expression to extract a percentage

I have strings like the following: blabla a13724bla-bla244 35%
Notice that there is always a space before the percentage. I would like to extract the percentage number (so, without the %) from these strings using the Linux shell.

Assuming you have GNU grep:
$ grep -oP '\d+(?=%)' <<< "blabla a13724bla-bla244 35%"
35

Using sed:
echo blabla a13724bla-bla244 35% | sed 's/.*[ \t][ \t]*\([0-9][0-9]*\)%.*/\1/'
If you expect to have multiple percentages in a line then:
echo blabla 20% a13724bla-bla244 35% | \
sed -e 's/[^%0-9 ]*//g;s/  */\n/g' | sed -n '/%/p'
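For the multiple-percentage case, a shorter sketch is to let grep print every "digits followed by %" match and then delete the % signs. It assumes a grep that supports -o (GNU and BSD grep do), and unlike the question it does not insist on a space before the number. The function name extract_percentages is made up.

```shell
# Hypothetical helper: print each percentage found on stdin, one per line, without the %.
extract_percentages() {
    grep -oE '[0-9]+%' | tr -d '%'
}

printf '%s\n' 'blabla 20% a13724bla-bla244 35%' | extract_percentages
```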

You can try this
echo "blabla a13724bla-bla244 35%" | cut -d' ' -f3 | sed 's/\%//g'
NOTE: Assumption is the input is always in this format and percentage is 3rd token separated by space.

You may try this regular expression:
/\s(\d+)%/

Use this regular expression:
\s(\d{1,3})%
If you need it in shell, you can use sed or this perl one-liner:
echo "blah 35%" | perl -pe "s/.*\s(\d{1,3})%/\1/g"
35

If the percentage is always in the same column, maybe you should try awk instead of a regular expression.
cat file.txt |awk '{print $3}' |cut -d "%" -f 1
This prints the third column with the % removed.
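A variant that stays entirely in awk is sketched below. It assumes the percentage is always the last whitespace-separated field, which holds for the sample line but is not stated in general; the function name last_percentage is hypothetical.

```shell
# Hypothetical helper: if the last field looks like "digits%", print the digits.
last_percentage() {
    awk '$NF ~ /^[0-9]+%$/ { sub(/%$/, "", $NF); print $NF }'
}

printf '%s\n' 'blabla a13724bla-bla244 35%' | last_percentage
```

Lines whose last field is not a percentage produce no output.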
