how to regex replace before colon? - regex

this is my original string:
NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
I only want to add a backslash before each space that comes before the ':'.
So this is what I finally want:
NetworkManager/system\ connections/Wired\ 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
I need to do this in bash, so sed, awk, and grep are all OK for me.
I have tried the following sed commands, but none of them work:
echo NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1 | sed 's/ .*\(:.*$\)/\\ .*\1/g'
echo NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1 | sed 's/\( \).*\(:.*$\)/\\ \1.*\2/g'
echo NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1 | sed 's/ .*\(:.*$\)/\\ \1/g'
echo NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1 | sed 's/\( \).*\(:.*$\)/\\ \1\2/g'
Thanks for answering my question.
I am still quite new to Stack Overflow and don't know how to control the formatting in comments,
so I am editing my original question instead.
My real story is:
when I use grep or cscope to search for a keyword, for example "address1", under the /etc folder,
the result looks like:
./NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
If I use vim to open the file under the cursor, and my vim cursor is on the word "NetworkManager",
then vim will understand the file name as
"./NetworkManager/system"
That's why I want to add "\" before each space, so the search result is more vim-friendly :)
I did try to change cscope's source code, but it is very difficult to fully achieve this, so I have to do a post-replacement :(

If you only want to do the replacements if there is a : present in the string, you can check if there are at least 2 columns, setting the (output)field separator to a colon.
Data:
$ cat file
NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
NetworkManager/system connections/Wired 1.nmconnection 14 address1=10.1.10.71/24,10.1.10.1
Example in awk:
awk 'BEGIN {FS=OFS=":"}{if(NF>1)gsub(" ","\\ ",$1)}1' file
Output
NetworkManager/system\ connections/Wired\ 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1
NetworkManager/system connections/Wired 1.nmconnection 14 address1=10.1.10.71/24,10.1.10.1

This can be done simply in awk. With your shown samples, please try the following:
awk 'BEGIN{FS=OFS=":"} {gsub(/ /,"\\\\&",$1)} 1' Input_file
Explanation: set the field separator and the output field separator to : for this program. Then, in the main block, use awk's gsub (global substitution) function to replace each space with a backslash followed by that space, in the 1st field only (per the OP's remarks, it should only be done before the :), and then print the line.

An idea for a perl one-liner in bash that uses \G and \K (similar to @CarySwoveland's comment):
perl -pe 's/\G[^ :]*\K /\\ /g' myfile
See this demo at tio.run or a pattern demo at regex101.

This might work for you (GNU sed):
sed -E ':a;s/^([^: ]*) /\1\n/;ta;s/\n/\\ /g' file
Replace the spaces before : with newlines, then replace each newline with a backslash followed by a space.
Alternative using the hold space:
sed -E 's/:/\n:/;h;s/ /\\ /g;G;s/\n.*\n//' file
Split the line on the first :.
Amend the front section, remove the middle and append the unadulterated back section.
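The same split / amend / rejoin steps can also be sketched in plain POSIX shell, with sed used only for the replacement itself. This is just an illustration under assumptions: `escape_before_colon` is a made-up helper name, and it treats the first colon on the line as the separator.

```shell
#!/bin/sh
# escape_before_colon is a hypothetical helper name, not part of any answer above.
escape_before_colon() {
    line=$1
    case $line in
        *:*)
            head=${line%%:*}   # everything before the first colon
            tail=${line#*:}    # everything after the first colon
            # Escape spaces in the front section only, then rejoin with the colon.
            head=$(printf '%s\n' "$head" | sed 's/ /\\ /g')
            printf '%s:%s\n' "$head" "$tail"
            ;;
        *)
            # No colon on the line: print it unchanged.
            printf '%s\n' "$line"
            ;;
    esac
}

escape_before_colon 'NetworkManager/system connections/Wired 1.nmconnection:14 address1=10.1.10.71/24,10.1.10.1'
```

To filter a whole stream (for example grep output), the function could be called from a `while IFS= read -r line` loop; for large inputs the awk or sed one-liners above will likely be faster.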

My answer is ugly and I think RavinderSingh13's answer is THE ONE, but I already took the time to write mine and it works (It's written step by step, but it's a one line command):
I got inspired by HatLess answer:
first get the text before the : with cut (I put the string in a file to make it easy to read, but this works on echo):
cut -d':' -f1 infile
Then replace spaces using sed:
cut -d':' -f1 infile | sed 's/\([a-z]\) /\1\\ /g'
Then echo the output with no new line:
echo -n "$(cut -d':' -f1 infile | sed -e 's/\([a-z]\) /\1\\ /g')"
Add the missing : and what comes after it:
echo -n "$(cut -d':' -f1 infile | sed -e 's/\([a-z]\) /\1\\ /g')" | cat - <(echo -n :) | cat - <(cut -d':' -f2- infile)

Related

How to remove special characters like a single quote from a string?

Using Sed I tried but it did not worked out.
Basically, I have a string say:-
Input:-
'http://www.google.com/photos'
Output required:-
http://www.google.com
I tried using sed but escaping ' is not possible.
what i did was:-
sed 's/\'//' | sed 's/photos//'
sed for photos worked but for ' it didn't.
Please suggest what can be the solution.
Escaping ' in sed is possible via a workaround:
sed 's/'"'"'//g'
#      |^^^+--- bash string with the single quote inside
#      |   '--- return to sed string
#      '------- leave sed string and go to bash
But for this job you should use tr:
tr -d "'"
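For comparison, here is a small sketch that runs three variants on the string from the question (the quote-splicing workaround, a double-quoted sed script, and tr). The variable name `input` is only for illustration; each command should print the URL without the surrounding quotes.

```shell
#!/bin/sh
input="'http://www.google.com/photos'"

# 1. Close the single-quoted sed script, splice in a double-quoted ', reopen it.
printf '%s\n' "$input" | sed 's/'"'"'//g'

# 2. Double-quote the whole sed script so the ' needs no escaping at all.
printf '%s\n' "$input" | sed "s/'//g"

# 3. Delete the character with tr instead of sed.
printf '%s\n' "$input" | tr -d "'"
```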
Perl substitutions have a syntax nearly identical to sed's. Perl often works better than sed, is installed by default on almost every system, and behaves the same way on all machines (portability):
$ echo "'http://www.google.com/photos'" |perl -pe "s#\'##g;s#(.*//.*/)(.*$)#\1#g"
http://www.google.com/
Mind that this solution will keep only the domain name with http in front, discarding all words following http://www.google.com/
If you want to do it with sed , you can use sed "s/'//g" as advised by Wiktor Stribiżew in comments.
PS: I sometimes refer to special chars with their ascii hex code of the special char as advised by man ascii, which is \x27 for '
So for sed you can do it:
$ echo "'http://www.google.com/photos'" |sed -r "s#'##g; s#(.*//.*/)(.*$)#\1#g;"
http://www.google.com/
# sed "s#\x27##g" will also remove the single quote, using its hex ASCII code (GNU sed).
$ echo "'http://www.google.com/photos'" |sed -r "s#'##g; s#(.*//.*)(/.*$)#\1#g;"
http://www.google.com #Without the last slash
If your string is stored in a variable, you can achieve above operations with pure bash, without the need of external tools like sed or perl like this:
$ a="'http://www.google.com/photos'" && a="${a:1:-1}" && echo "$a"
http://www.google.com/photos
# This removes 1st and last char of the variable , whatever this char is.
$ a="'http://www.google.com/photos'" && a="${a:1:-1}" && echo "${a%/*}"
http://www.google.com
#This deletes every char from the end of the string up to the first found slash /.
#If you need the last slash you can just add it to the echo manually like echo "${a%/*}/" -->http://www.google.com/
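As far as I know, the negative length in `${a:1:-1}` needs a reasonably recent bash (4.2 or later). If that is a concern, a sketch using only POSIX parameter expansion could look like this; the variable names `a` and `base` are just for illustration.

```shell
#!/bin/sh
a="'http://www.google.com/photos'"

a=${a#?}      # drop the first character, whatever it is
a=${a%?}      # drop the last character, whatever it is
echo "$a"     # http://www.google.com/photos

base=${a%/*}  # remove the shortest suffix matching "/*", i.e. from the last slash on
echo "$base"  # http://www.google.com
```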
It's unclear whether the ' are actually around your string, but this should take care of it:
str="'http://www.google.com/photos'"
echo "$str" | sed s/\'//g | sed 's/\/photos//g'
Combined:
echo "$str" | sed -e "s/'//g" -e 's/\/photos//g'
Using tr:
echo "$str" | sed -e "s/\/photos//g" | tr -d \'
Result:
http://www.google.com
If the single quotes are not around your string it should work regardless.

How to remove a space between matching words?

I've read a lot of questions about how to replace spaces from a file but I have the following problem:
I have a file like so:
<foo>"crazy foo"</foo> <bar>dull-bar</bar>
and I'm trying to remove spaces between > < and only those ones so the file would be like:
<foo>"crazy foo"</foo><bar>dull-bar</bar>
So far I've tried to remove then by using sed and tr. Sed is not working by any chance and using tr '> <' '><' outputs:
<foo>"crazy foo"</foo><<bar>dull-bar</bar>
sed -i -e "s/> *</></g" YourFile
-i means YourFile is modified. Remove this option to test your command and display the result in shell output.
The space followed by * matches zero or more spaces.
The g at the end of sed expression means "Replace all the occurrences".
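A quick way to check the behaviour without touching a file is to feed the sample line through sed on stdin. This is only a sketch; the variable name `line` is made up, and `[[:space:]]` is used so that tabs between the tags are removed as well. It also shows why tr could not do this: tr translates single characters, so with tr '> <' '><' the space is mapped to '<', which is where the extra '<' in the question came from.

```shell
#!/bin/sh
line='<foo>"crazy foo"</foo> <bar>dull-bar</bar>'

# sed matches the whole sequence ">", optional whitespace, "<",
# so the space inside "crazy foo" is left untouched.
printf '%s\n' "$line" | sed 's/>[[:space:]]*</></g'
```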
You could try something like this
echo '<foo>"crazy foo"</foo> <bar>dull-bar</bar>' | sed 's/>[[:space:]]*</></g'
awk -F"\"" '{print $3}' file.txt | sed 's/ //g'

Sed : print all lines after match

I got my research result after using sed :
zcat file* | sed -e 's/.*text=\(.*\)status=[^/]*/\1/' | cut -f 1 - | grep "pattern"
But it only shows the part that I cut. How can I print all lines after a match ?
I'm using zcat so I cannot use awk.
Thanks.
Edited :
This is my log file :
[01/09/2015 00:00:47] INFO=54646486432154646 from=steve idfrom=55516654455457 to=jone idto=5552045646464 guid=100021623456461451463 num=6 text=hi my number is 0 811 22 1/12 status=new survstatus=new
My aim is to find all users that spam my site with their telephone numbers (using grep "pattern") then print all the lines to get all the information about each spam. The problem is there may be matches in INFO or id, so I use sed to get the text first.
Printing all lines after a match in sed:
$ sed -ne '/pattern/,$ p'
# alternatively, if you don't want to print the match:
$ sed -e '1,/pattern/ d'
Filtering lines when pattern matches between "text=" and "status=" can be done with a simple grep, no need for sed and cut:
$ grep 'text=.*pattern.* status='
You can use awk
awk '/pattern/,EOF'
n.b. don't be fooled: EOF is just an uninitialized variable, and by default 0 (false). So that condition cannot be satisfied until the end of file.
Perhaps this could be combined with all the previous answers using awk as well.
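A small self-contained check of the "from the first match to the end" behaviour, using made-up sample lines (the `sample` function is only there to generate test input). The awk variant writes the never-true end condition as 0 rather than relying on an unset variable named EOF.

```shell
#!/bin/sh
# Four sample lines; "pattern" first appears on line 2.
sample() { printf '%s\n' 'one' 'two pattern' 'three' 'four'; }

# sed: print from the first matching line through the last line ($).
sample | sed -n '/pattern/,$p'

# awk: same range; the end condition 0 is never true, so the range
# stays open until the input ends.
sample | awk '/pattern/,0'
```

Both commands should print the matching line and everything after it.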
Maybe this is what you actually want? Find lines matching "pattern" and extract the field after text= up through just before status=?
zcat file* | sed -e '/pattern/s/.*text=\(.*\)status=[^/]*/\1/'
You are not revealing what pattern actually is -- if it's a variable, you cannot use single quotes around it.
Notice that \(.*\)status=[^/]* would match up through survstatus=new in your example. That is probably not what you want? There doesn't seem to be a status= followed by a slash anywhere -- you really should explain in more detail what you are actually trying to accomplish.
Your question title says "all line after a match" so perhaps you want everything after text=? Then that's simply
sed 's/.*text=//'
i.e. replace up through text= with nothing, and keep the rest. (I trust you can figure out how to change the surrounding script into zcat file* | sed '/pattern/s/.*text=//' ... oops, maybe my trust failed.)
The seldom used branch command will do this for you. Until you match, use n for next then branch to beginning. After match, use n to skip the matching line, then a loop copying the remaining lines.
cat file | sed -n -e ':start; /pattern/b match;n; b start; :match n; :copy; p; n ; b copy'
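If the branching is hard to follow, the same "everything after the matching line, but not the match itself" result can be sketched in awk with a flag. The variable name `found` and the sample input are assumptions for illustration only.

```shell
#!/bin/sh
sample() { printf '%s\n' 'one' 'two pattern' 'three' 'four'; }

# "found" is unset (false) until a line matches. The bare condition "found"
# prints the current line when it is true. Because that test runs before the
# flag is set, the matching line itself is not printed.
sample | awk 'found; /pattern/ { found = 1 }'
```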
Your current pipeline is:
zcat file* | sed -e 's/.*text=\(.*\)status=[^/]*/\1/' | cut -f 1 - | grep "pattern"
Instead, change the last two segments of the pipeline so that it reads:
zcat file* | sed -e 's/.*text=\(.*\)status=[^/]*/\1/' | awk '$1 ~ "pattern" {print $0}'

(GNU)Sed: how to replace any character from nth character to nth+10?

I need to replace characters from 10th to 20th in the string which looks like that:
123456789012345678901234567890
So far I've tried:
a)
Works for the 10th character ONLY:
echo "123456789012345678901234567890" | sed 's/./X/10'
b)
Doesn't work on the range:
echo "123456789012345678901234567890" | sed 's/./X/10,20'
echo "123456789012345678901234567890" | sed 's/./X/10\,20'
echo "123456789012345678901234567890" | sed 's/./X/\{10,20\}'
echo "123456789012345678901234567890" | sed 's/./X/\{10\,20\}'
Does not work and I get error
unknown option to `s'
So - the question is - how do I make this to work:
echo "123456789012345678901234567890" | sed 's/./X/10,20'
Try:
$ sed -r "s/^(.{9})(.{11})/\1XXXXXXXXXXX/" <<< 123456789012345678901234567890
123456789XXXXXXXXXXX1234567890
This is an awkward problem for sed; I could only find this solution (note that it replaces the 11th through 20th characters):
$ sed 's/^\(.\{10\}\)\(.\{10\}\)/\1XXXXXXXXXX/' <<< 123456789012345678901234567890
1234567890XXXXXXXXXX1234567890
With awk it looks nicer:
$ awk 'BEGIN{FS=OFS=""} {for (i=10;i<=20;i++) $i="X"} {print}' <<< 123456789012345678901234567890
123456789XXXXXXXXXXX1234567890
You can do it with bash parameter substitution like this:
#!/bin/bash
s="123456789012345678901234567890"
l=${s:0:9} # Extract left part (characters 1-9)
m=${s:9:11} # Extract middle part (characters 10-20)
r=${s:20} # Extract right part (characters 21 onward)
# Diddle with middle part to your heart's content and re-assemble "$l$m$r" when done
m=$(sed 's/./X/g' <<<"$m")
echo "$l$m$r"
See here for more explanation and examples.
Or, you can do this:
transform the row of letters into a column so each is on its own line
apply your edits to LINES 10 through 20 (as opposed to characters 10 through 20)
transform column of letters back into a row (by deleting linefeeds)
as shown in the one-liner below:
$ echo "123456789012345678901234567890" | sed "s/\(.\)/\1\n/g" | sed "10,20s/./X/" | tr -d "\n"
I know, that it looks ugly, but:
echo "123456789012345678901234567890" | \
sed 's/^\(.\{10\}\).\{10\}\(.*\)/\1XXXXXXXXXX\2/'
Without placing multiple X in sed command:
echo "123456789012345678901234567890" | sed -r 's/^(.{9})(.{11})(.*)$/\1\n\2\n\3/' | sed '2s/./X/g' | tr -d '\n'
To replace the 10th to 20th characters, inclusive (11 characters in total), try:
echo 123456789012345678901234567890 | sed 's/\(.\{9\}\).\{11\}/\1XXXXXXXXXXX/'
123456789XXXXXXXXXXX1234567890
With GNU sed, you can use the -r switch to remove most of the backslashes:
echo 123456789012345678901234567890 | sed -r 's/(.{9}).{11}/\1XXXXXXXXXXX/'
Or the naive approach also works here:
echo 123456789012345678901234567890 | sed 's/\(.........\).........../\1XXXXXXXXXXX/'
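The counts in these patterns follow from the positions: to replace characters 10 through 20 inclusive, 9 characters are kept and 20 - 10 + 1 = 11 are replaced. To avoid counting by hand, the arithmetic can be parameterized. The sketch below uses awk's substr rather than sed; `mask_range` is a hypothetical helper name, and it assumes the string contains no backslashes (awk -v would interpret them as escapes).

```shell
#!/bin/sh
# mask_range FIRST LAST STRING
# Replaces characters FIRST..LAST (1-based, inclusive) of STRING with X.
# Hypothetical helper for illustration only.
mask_range() {
    awk -v s="$1" -v e="$2" -v str="$3" 'BEGIN {
        mask = ""
        # One X per position in the range, without running past the end of the string.
        for (i = s; i <= e && i <= length(str); i++) mask = mask "X"
        # Keep the first s-1 characters, then the mask, then everything after position e.
        print substr(str, 1, s - 1) mask substr(str, e + 1)
    }'
}

mask_range 10 20 123456789012345678901234567890
```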
This might work for you (GNU sed):
sed ':a;/.\{9\}X\{11\}/!s/\(.\{9\}X*\)./\1X/;ta' file
or with a bit of syntactic sugar:
sed -r ':a;/.{9}X{11}/!s/(.{9}X*)./\1X/;ta' file

Regex replacing hyphens in attributes name only

I have a string that looks something like this.
<tag-name i-am-an-attribute="123" and-me-too="321">
All I want to do is replace the dashes with underscores, but the tag-name should remain as it is.
Hope there are some regex gurus who can help me out.
[solution]
In case someone needs this.
I ended up with a perl oneliner command
echo '<tag-name i-am-an-attribute="123" and-me-too="321">' | perl -pe '1 while s/( [^"=\s]*)-([^"=\s]*=)/${1}_$2/'
results in
<tag-name i_am_an_attribute="123" and_me_too="321">
Using sed:
sed ':l;s/-\([^- ]*\)\( *=\)/_\1\2/g;tl' input
Gives:
<tag-name i_am_an_attribute="123" and_me_too="321">
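To see why the loop is needed: each pass can only rewrite the last remaining hyphen of an attribute name (the one directly followed by non-hyphen characters and then "="), so the t branch repeats the substitution until nothing changes. Below is a sketch of the same command written with separate -e expressions, which should also be accepted by sed implementations that do not allow a label and commands on one line. The function name and the second test line are made up for illustration, and a value containing something like "-x=" would presumably still be rewritten.

```shell
#!/bin/sh
# attr_hyphens_to_underscores is a hypothetical name used for illustration.
attr_hyphens_to_underscores() {
    # Each pass rewrites the last remaining hyphen of every attribute name
    # (a hyphen followed by non-hyphen, non-space characters and then "=").
    # "t loop" jumps back to the label as long as a pass substituted something.
    sed -e ':loop' -e 's/-\([^- ]*\)\( *=\)/_\1\2/g' -e 't loop'
}

printf '%s\n' '<tag-name i-am-an-attribute="123" and-me-too="321">' | attr_hyphens_to_underscores
# A hyphen inside a quoted value is not followed by "=", so it is left alone.
printf '%s\n' '<tag-name data-id="a-b">' | attr_hyphens_to_underscores
```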
With <tag-name i-am-an-attribute="123" and-me-too="321"> as a line in a file:-
read -r < file
fullstring=$(echo "${REPLY}" | sed s'#-name #-name:#')
field1=$(echo "${fullstring}" | cut -d':' -f1)
field2=$(echo "${fullstring}" | cut -d':' -f2)
fixedfield=$(echo "${field2}" | sed s'#-#_#'g)
echo "${field1} ${fixedfield}"
I'm discovering that the most important thing with scripting is to give yourself anchors within the text that you can use to cut it up into segments, which you can then perform operations on. Try to format your text as actual fields with separators; it makes life a lot easier.