using sed command in substitution - regex

When I use this command
echo jones:Adrian W. Jones/OSD211/555-0123 | sed -e 's=^([^:]*):[^/]*/([^/]*)/.*$=\1:\2='
I got: sed: command garbled: s=^([^:]*):[^/]*/([^/]*)/.*$=\1:\2=
but when I use this one:
echo jones:Adrian W. Jones/OSD211/555-0123 | sed -e 's=^\([^:]*\):[^/]*/\([^/]*\)/.*$=\1:\2='
I got: jones:OSD211
Why should I escape the ( in sed?

By default, sed uses basic regular expressions (BRE). In BRE, ( is a literal parenthesis; you have to escape it as \( to give it its special meaning (grouping).
P.S. GNU sed has the -r option to enable ERE, in which an unescaped ( starts a group.
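To make the BRE/ERE difference concrete, here is a minimal sketch that runs the same substitution both ways. It assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do); the variable names are only illustrative.

```shell
line='jones:Adrian W. Jones/OSD211/555-0123'

# BRE (the default): groups are written \( ... \)
bre=$(printf '%s\n' "$line" | sed 's=^\([^:]*\):[^/]*/\([^/]*\)/.*$=\1:\2=')

# ERE (-E): groups are written ( ... ), and \( would mean a literal parenthesis
ere=$(printf '%s\n' "$line" | sed -E 's=^([^:]*):[^/]*/([^/]*)/.*$=\1:\2=')

printf '%s\n' "$bre" "$ere"
# jones:OSD211
# jones:OSD211
```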

Related

unterminated `s' command, can't find my mistake

sudo wbinfo --group-info GROUPNAME| sed -r -e 's/(?:DOMAIN\\(\w+),?)|(?:[^]+:)/$1/g'
This command results in an
sed: -e expression #1, char 36: unterminated `s' command
The output of
sudo wbinfo --group-info GROUPNAME
is like
GROUPNAME:x:0123456789:DOMAIN\user1,DOMAIN\user2,DOMAIN\user3,...,DOMAIN\userN
I tried escaping all instances of ( with \(, \ with \\ (also \\ with \\\\)
sudo wbinfo --group-info GROUPNAME| sed -r -e s/'(?:DOMAIN\\(\w+),?)|(?:[^]+:)'/$1/g
(changed quoted area)
sudo wbinfo --group-info GROUPNAME| sed -r -e s/'(?:DOMAIN\\(\w+),?)|(?:[^]+:)/\1/g'
(\1 instead of $1)
I still don't know how to get what I need:
user1 user2 user3 ... userN
TL;DR
Your attempt is too complicated, you can simply use this:
sed -r 's/[^\]+DOMAIN\\([[:alnum:]]+)/\1 /g'
About the syntax error:
You are using sed -r, which enables POSIX extended regular expressions. Note that in extended regular expressions the ? is a quantifier that makes the preceding item optional. You need to escape it:
sed -r -e 's/(\?:DOMAIN\\(\w+),\?)|(\?:[^]+:)/$1/g'
However, there is still a problem with the regex: you are using [^]. Note that ^, when it is the first character in a character class, negates that class. You are using the ^ but did not say which characters should not be matched (and a ] directly after the ^ is taken literally, so the class is never closed). You need to put in something like:
sed -r -e 's/(\?:DOMAIN\\(\w+),\?)|(\?:[^abc]+:)/$1/g'
awk to the rescue!
$ ... | awk -F'\\\\' -v RS=, '{print $2}'
This gives the result one user per line; if you want them to appear on a single line, add ... | xargs
Here's another approach with sed:
sed -r -e 's/^.*://' -e 's/[^,]+\\//g' -e 's/,/ /g'
First remove all the stuff before the last colon in the line,
then remove all the domain parts (non-commas followed by a backslash),
then change commas to spaces.
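The three steps can be tried without wbinfo by feeding sed a sample line. This is a sketch under assumptions: the sample line below is made up to match the format shown in the question (with three users), and -E is used as the portable spelling of -r.

```shell
# Hypothetical sample of the wbinfo output format from the question
line='GROUPNAME:x:0123456789:DOMAIN\user1,DOMAIN\user2,DOMAIN\user3'

users=$(printf '%s\n' "$line" |
  sed -E -e 's/^.*://' \
         -e 's/[^,]+\\//g' \
         -e 's/,/ /g')
# 1st -e: drop everything up to the last colon
# 2nd -e: drop each "DOMAIN\" prefix (non-commas followed by a backslash)
# 3rd -e: turn the commas into spaces

printf '%s\n' "$users"
# user1 user2 user3
```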

Sed replace asterisk symbols

I am trying to replace a series of asterisk symbols in a text file with -999.9 using sed. However, I can't figure out how to properly escape the wildcard symbol.
e.g.
$ echo "2006.0,1.0,************,-5.0" | sed 's/************/-999.9/g'
sed: 1: "s/************/-999.9/g": RE error: repetition-operator operand invalid
Doesn't work. And
$ echo "2006.0,1.0,************,-5.0" | sed 's/[************]/-999.9/g'
2006.0,1.0,-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9,-5.0
puts a -999.9 for every * which isn't what I intended either.
Thanks!
Use this:
echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
Test:
$ echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
2006.0,1.0,-999.9,-5.0
Any of these (and more) is a regexp that will modify that line as you want:
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\**/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\+/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*+/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{12\}/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{12}/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{1,\}/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{1,}/999.9/g'
2006.0,1.0,999.9,-5.0
sed operates on regular expressions, not strings, so you need to learn regular expression syntax if you're going to use sed. In particular, learn the difference between BREs (which sed uses by default), EREs (which some seds can be told to use instead), and PCREs (which sed never uses but some other tools and "regexp checkers" do). Only the first solution above is a BRE that will work on all seds on all platforms. Google is your friend.
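As a quick check, the sketch below sticks to plain BRE syntax (no \+ and no -r) and uses the -999.9 replacement from the question. The bracket form [*][*]* is an assumed equivalent spelling of \*\**, not one of the commands listed above.

```shell
line='2006.0,1.0,************,-5.0'

# One literal asterisk followed by zero or more literal asterisks
escaped=$(printf '%s\n' "$line" | sed 's/\*\**/-999.9/g')

# Same idea; inside a bracket expression * has no special meaning
bracket=$(printf '%s\n' "$line" | sed 's/[*][*]*/-999.9/g')

printf '%s\n' "$escaped" "$bracket"
# 2006.0,1.0,-999.9,-5.0
# 2006.0,1.0,-999.9,-5.0
```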
* is a regex symbol that needs to be escaped.
You can even use BASH string replacement:
s="2006.0,1.0,************,-5.0"
echo "${s/\**,/-999.9,}"
2006.0,1.0,-999.9,-5.0
Using sed:
sed 's/\*\+/999.9/g' <<< "$s"
2006.0,1.0,999.9,-5.0
Yes, * is a special metacharacter that repeats the previous token zero or more times. Escape * in order to match literal * characters.
sed 's/\*\*\*\*\*\*\*\*\*\*\*\*/-999.9/g'
When this possibility was introduced into gawk I have no idea!
gawk -F, '{sub(/************/,"-999.9",$3)}1' OFS=, file
2006.0,1.0,-999.9,-5.0

How to ignore word delimiters in sed

So I have a bash script which is working perfectly except for one issue with sed.
full=$(echo $full | sed -e 's/\b'$first'\b/ /' -e 's/  / /g')
This would work great, except there are instances where the variable $first is preceded immediately by a period, not a blank space. In those instances, I do not want the variable removed.
Example:
full="apple.orange orange.banana apple.banana banana";first="banana"
full=$(echo $full | sed -e 's/\b'$first'\b/ /' -e 's/  / /g')
echo $first $full;
I want to only remove the whole word banana, and not make any change to orange.banana or apple.banana, so how can I get sed to ignore the dot as a delimiter?
You want "banana" that is preceded by beginning-of-string or a space, and followed by a space or end-of-string
$ sed -r 's/(^|[[:blank:]])'"$first"'([[:blank:]]|$)/ /g' <<< "$full"
apple.orange orange.banana apple.banana
Note the use of the -r option (for BSD sed, use -E), which enables extended regular expressions and allows us to omit a lot of backslashes.
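Here is a small self-contained sketch of the same idea that avoids the bash-only <<< here-string. The second sed command, which trims the blank left at the end of the line, is an addition for illustration and not part of the answer above. It also assumes $first contains no regex metacharacters.

```shell
full='apple.orange orange.banana apple.banana banana'
first='banana'

# Match $first only when it is delimited by blanks or the line boundaries,
# then strip any blank that the replacement leaves at the end of the line.
result=$(printf '%s\n' "$full" |
  sed -E 's/(^|[[:blank:]])'"$first"'([[:blank:]]|$)/ /g; s/[[:blank:]]+$//')

printf '%s\n' "$result"
# apple.orange orange.banana apple.banana
```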

Why does sed -e 's/[ +-]?[0-9]*\.[0-9]*//g' not work?

I want to remove all floating point numbers from a string using sed. Therefore I use
sed -e 's/[ +-]?[0-9]*\.[0-9]*//g'
But it does not work:
echo 1.2456 | sed -e 's/[ +-]?[0-9]*\.[0-9]*//g'
gives 1.2456. If I remove the [ +-]? block, it works for positive numbers.
You need to escape the question mark:
echo 1.2456 | sed -e 's/[ +-]\?[0-9]*\.[0-9]*//g'
The ? sign is an extended regex character. sed needs to be called with the -r option to enable the extended expressions.
Escape the ?, or use sed -r; then it should work.
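This version is more compatible: \? doesn't work on all systems, so it uses * instead. Inside a bracket expression, + and - need no escaping (a backslash there would be matched as a literal backslash).
echo 1.2456 | sed -e 's/[ +-]*[0-9]*\.[0-9]*//g'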
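To compare the variants side by side, the sketch below runs the ERE form (unescaped ?) and the plain-BRE form (* instead of ?) on a made-up input. It assumes a sed that accepts -E; the input string is only an example.

```shell
input='x=-3.5;'

# ERE: ? is a quantifier without any backslash
ere=$(printf '%s\n' "$input" | sed -E 's/[ +-]?[0-9]*\.[0-9]*//g')

# Plain BRE: no \? needed, at the cost of accepting repeated signs or spaces
bre=$(printf '%s\n' "$input" | sed 's/[ +-]*[0-9]*\.[0-9]*//g')

printf '%s\n' "$ere" "$bre"
# x=;
# x=;
```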

How can I insert a tab character with sed on OS X?

I have tried:
echo -e "egg\t \t\t salad" | sed -E 's/[[:blank:]]+/\t/g'
Which results in:
eggtsalad
And...
echo -e "egg\t \t\t salad" | sed -E 's/[[:blank:]]+/\\t/g'
Which results in:
egg\tsalad
What I would like (the two words separated by a single tab):
egg salad
Try: Ctrl+V and then press Tab.
Use ANSI-C style quoting: $'string'
sed $'s/foo/\t/'
So in your example, simply add a $:
echo -e "egg\t \t\t salad" | sed -E $'s/[[:blank:]]+/\t/g'
OS X's sed doesn't understand \t at all, since it's essentially the ancient 4.2BSD sed left over from 1982 or thereabouts. Use a literal tab (which in bash and vim is Ctrl+V, Tab), or install GNU sed to get a more reasonable sed.
Another option is to use $(printf '\t') to insert a tab, e.g.:
echo -e "egg\t \t\t salad" | sed -E "s/[[:blank:]]+/$(printf '\t')/g"
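A variation on the same idea, sketched with the tab stored in a variable so the sed script stays readable. It relies only on printf and on sed -E, so it should not depend on sed understanding \t itself; the variable names are illustrative.

```shell
# Store one literal tab; command substitution only strips trailing newlines
tab=$(printf '\t')

# Collapse every run of blanks (spaces and tabs) into a single tab
out=$(printf 'egg\t \t\t salad\n' | sed -E "s/[[:blank:]]+/${tab}/g")

# Show the tab visibly as \t (od -c prints control characters as escapes)
printf '%s\n' "$out" | od -c
```

The od -c call is there only to make the tab visible when checking the result by eye.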
try awk
echo -e "egg\t \t\t salad" | awk '{gsub(/[[:blank:]]+/,"\t");print}'
A workaround for tab on OS X is to use "\    ", an escape char followed by four spaces.
If you are trying to find the last instance of a pattern, say a " })};" and insert a file on a newline after that pattern, your sed command on OS X would look like this:
sed -i '' -e $'/^\    \})};.*$/ r fileWithTextIWantToInsert' FileIWantToChange
The markup makes it unclear: the escape char must be followed by four spaces in order for sed to register a tab character on OS X.
The same trick works if the pattern you want to find is preceded by two spaces, and I imagine it will work for finding a pattern preceded by any number of spaces as well.