How to format a list using regex

I cannot figure out how to make this work:
I have:
icon-braille icon-bookmark-empty icon-blogger icon-adult icon-address-book
And I want:
'icon-braille','icon-bookmark-empty','icon-blogger','icon-adult','icon-address-book'

This sed command also works on OS X: sed "s/ /','/g;s/^/'/;s/$/'/"
$ echo "icon-braille icon-bookmark-empty" | sed "s/ /','/g;s/^/'/;s/$/'/"
'icon-braille','icon-bookmark-empty'

With shell and sed:
echo "'"$(sed "s/ /','/g")"'"
Example:
$ echo "'"$(sed "s/ /','/g")"'"
icon-braille icon-bookmark-empty icon-blogger icon-adult icon-address-book
'icon-braille','icon-bookmark-empty','icon-blogger','icon-adult','icon-address-book'
The first line was typed as input; the second is the output.

$ echo "icon-braille icon-bookmark-empty icon-blogger icon-adult icon-address-book" | sed -e "s/ \+/','/g" -e "s/^/'/" -e "s/$/'/"
'icon-braille','icon-bookmark-empty','icon-blogger','icon-adult','icon-address-book'
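Note that \+ in a basic regular expression is a GNU extension, so the command above may not behave the same on other seds. A minimal sketch of a more portable variant, assuming only POSIX BRE syntax (quote_words is a made-up helper name, not something from the answers):

```shell
# quote_words: hypothetical helper; reads one line of space-separated
# words on stdin and prints them single-quoted and comma-separated.
quote_words() {
    # "  *" (two spaces, then *) means "one or more spaces" in POSIX BRE,
    # so a run of several spaces still produces a single separator.
    sed -e "s/  */','/g" -e "s/^/'/" -e "s/\$/'/"
}

echo "icon-braille   icon-bookmark-empty icon-blogger" | quote_words
# prints: 'icon-braille','icon-bookmark-empty','icon-blogger'
```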

You can just use the shell (assuming bash)
$ list="icon-braille icon-bookmark-empty icon-blogger icon-adult icon-address-book"
$ result=""; sep=""
$ for word in $list; do result+=$sep\'$word\'; sep=,; done
$ echo "$result"
'icon-braille','icon-bookmark-empty','icon-blogger','icon-adult','icon-address-book'
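The loop above relies on bash's += operator. A sketch of the same idea for a plain POSIX shell, with globbing switched off while splitting so that a word such as * is not expanded into file names (quote_list is a hypothetical name, and the function assumes the caller normally has globbing enabled):

```shell
# quote_list: hypothetical helper; takes the whole list as one argument.
quote_list() {
    result=""
    sep=""
    set -f                      # no pathname expansion while splitting
    for word in $1; do          # unquoted on purpose: split on whitespace
        result="${result}${sep}'${word}'"
        sep=","
    done
    set +f                      # turn globbing back on
    printf '%s\n' "$result"
}

quote_list "icon-braille icon-bookmark-empty icon-blogger"
# prints: 'icon-braille','icon-bookmark-empty','icon-blogger'
```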

Related

sed regular expression extraction

I have a range of strings which conform to one of the following two patterns:
("string with spaces",4)
or
(string_without_spaces,4)
I need to extract the "string" via a bash command, and so far have found a pattern that works for each, but not for both.
echo "(\"string with spaces\",4)" | sed -n 's/("\(.*\)",.*)/\1/ip'
output:string with spaces
echo "(string_without_spaces,4)" | sed -n 's/(\(.*\),.*)/\1/ip'
output:string_without_spaces
I have tried using "\?, but it does not match the " when it is present:
echo "(SIM,0)" | sed -n 's/("\?\(.*\)"\?,.*)/\1/ip'
output: SIM
echo "(\"SIM\",0)" | sed -n 's/("\?\(.*\)"\?,.*)/\1/ip'
output: SIM"
Can anyone suggest a pattern that would extract the string in both scenarios? I am not tied to sed, but would prefer not to have to install perl in this environment.
How about using [^"] instead of . so that the captured group cannot contain a "?
$ echo '("string with spaces",4)' | sed -n 's/("\?\([^"]*\)"\?,.*)/\1/p'
string with spaces
$ echo "(string_without_spaces,4)" | sed -n 's/("\?\([^"]*\)"\?,.*)/\1/p'
string_without_spaces
$ echo "(SIM,0)" | sed -n 's/("\?\([^"]*\)"\?,.*)/\1/p'
SIM
$ echo '("SIM",0)' | sed -n 's/("\?\([^"]*\)"\?,.*)/\1/p'
SIM
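The reason the .* attempt leaves a trailing quote is that .* is greedy and "\? is allowed to match nothing, so the closing " ends up inside the group. Also, \? is a GNU extension in basic regular expressions. A sketch of the same idea using the POSIX interval \{0,1\} instead (extract_field is a hypothetical name; the added ^ and $ anchors are my assumption that the whole line has this shape):

```shell
# extract_field: hypothetical helper; reads lines such as ("name",4) or
# (name,4) on stdin and prints the name without surrounding quotes.
extract_field() {
    # "\{0,1\}  optional double quote
    # [^"]*     the captured text, which can never contain a double quote
    sed -n 's/^("\{0,1\}\([^"]*\)"\{0,1\},.*)$/\1/p'
}

printf '%s\n' '("string with spaces",4)' '(string_without_spaces,4)' | extract_field
# prints:
# string with spaces
# string_without_spaces
```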

Sed replace asterisk symbols

I am trying to replace a series of asterisk symbols in a text file with -999.9 using sed. However, I can't figure out how to properly escape the wildcard symbol.
e.g.
$ echo "2006.0,1.0,************,-5.0" | sed 's/************/-999.9/g'
sed: 1: "s/************/-999.9/g": RE error: repetition-operator operand invalid
Doesn't work. And
$ echo "2006.0,1.0,************,-5.0" | sed 's/[************]/-999.9/g'
2006.0,1.0,-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9,-5.0
puts a -999.9 for every * which isn't what I intended either.
Thanks!
Use this:
echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
Test:
$ echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
2006.0,1.0,-999.9,-5.0
Any of these (and more) is a regexp that will modify that line as you want:
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\**/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\+/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*+/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{12\}/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{12}/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{1,\}/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{1,}/-999.9/g'
2006.0,1.0,-999.9,-5.0
sed operates on regular expressions, not strings, so to use sed you need to learn regular expression syntax, in particular the difference between BREs (which sed uses by default), EREs (which some seds can be told to use instead), and PCREs (which sed never uses but some other tools and "regexp checkers" do). The first solution above is a BRE that will work on all seds on all platforms; \+ in a BRE is a GNU extension, and -r is not accepted by every sed. Google is your friend.
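To make the portable BRE form concrete, here is a small sketch that wraps it in a function and replaces each run of asterisks with -999.9, as the question asks (mask_stars is a hypothetical name):

```shell
# mask_stars: hypothetical helper; replaces every run of one or more
# literal asterisks on stdin with -999.9.
mask_stars() {
    # \*   one literal asterisk
    # \**  followed by zero or more literal asterisks
    sed 's/\*\**/-999.9/g'
}

echo "2006.0,1.0,************,-5.0" | mask_stars
# prints: 2006.0,1.0,-999.9,-5.0
```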
* is a regex symbol that needs to be escaped.
You can even use BASH string replacement:
s="2006.0,1.0,************,-5.0"
echo "${s/\**,/-999.9,}"
2006.0,1.0,-999.9,-5.0
Using sed:
sed 's/\*\+/-999.9/g' <<< "$s"
2006.0,1.0,-999.9,-5.0
Yes, * is a special metacharacter that repeats the previous token zero or more times. Escape * in order to match literal * characters.
sed 's/\*\*\*\*\*\*\*\*\*\*\*\*/-999.9/g'
I have no idea when gawk started accepting this!
gawk -F, '{sub(/************/,"-999.9",$3)}1' OFS=, file
2006.0,1.0,-999.9,-5.0

Bash shave a first and/or last character from string, but only if it is a certain character

In bash I need to shave a first and/or last character from string, but only if it is a certain character.
If I have | I need
/foo/bar/hah/ => foo/bar/hah
foo/bar/hah => foo/bar/hah
You can downvote me for not listing everything I've tried, but the fact is I've tried at least 35 different sed strings and bash character tricks, many of which were from Stack Overflow. I simply cannot get this to work.
What's the problem with the simple one?
sed "s/^\///;s/\/$//"
Output is
foo/bar/hah
foo/bar/hah
In pure bash :
$ var=/foo/bar/hah/
$ var=${var%/}
$ echo ${var#/}
foo/bar/hah
$
Check bash parameter expansion
or with sed :
$ sed -r 's#(^/|/$)##g' file
How about simply this:
echo "$x" | sed -e 's:^/::' -e 's:/$::'
Further to @sputnick's answer and from this answer, here's a function that would do it:
STR="/foo/bar/etc/"
STRB="foo/bar/etc"
trimslashes() {
    local s="$1"   # work on a copy so the caller's STR is not overwritten
    s=${s#/}       # drop one leading slash, if present
    s=${s%/}       # drop one trailing slash, if present
    echo "$s"
}
trimslashes "$STR"
trimslashes "$STRB"
# foo/bar/etc
# foo/bar/etc
echo '/foo/bar/hah/' | sed 's#^/##' | sed 's#/$##'
Assuming / is the only character you're trying to remove, sed -E 's_^[/](.*)_\1_' should do the job for the leading character:
$ echo "$var1"; echo "$var2"
/foo/bar/hah
foo/bar/hah
$ echo "$var1" | sed -E 's_^[/](.*)_\1_'
foo/bar/hah
$ echo "$var2" | sed -E 's_^[/](.*)_\1_'
foo/bar/hah
If you also need to remove other characters at the start of the line, add them to the [/] class. For example, to remove a leading / or -, use sed -E 's_^[/-](.*)_\1_'
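The expression in this answer only touches the start of the line, while the question also asks about the last character. A minimal sketch that handles both ends with the same character class, in plain BRE (trim_edges is a hypothetical name):

```shell
# trim_edges: hypothetical helper; removes one leading and one trailing
# character from each stdin line, but only if it is / or -.
trim_edges() {
    # "_" is the s-command delimiter, so / needs no escaping.
    # A "-" placed last inside [...] is a literal hyphen.
    sed -e 's_^[/-]__' -e 's_[/-]$__'
}

echo "/foo/bar/hah/" | trim_edges   # prints: foo/bar/hah
echo "foo/bar/hah" | trim_edges     # prints: foo/bar/hah
```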
Here is an awk version:
echo "/foo/bar/hah/" | awk '{gsub(/^\/|\/$/,"")}1'
foo/bar/hah

sed: replace two or more tabs with one

I would like to replace double or more tabs in a string with one using sed. However, when I do
echo "A\t\tB\t\tC" | sed 's/\t\t/\t/g'
I get the same thing back
A\t\tB\t\tC
How can I get this?
A\tB\tC
Thanks in advance!
It looks like you nearly had it; you just missed the -e flag to echo:
$ echo -e "A\t\tB\t\tC" | sed -e 's/\t\t/\t/g'
A B C
$ echo -e "A\t\tB\t\tC"
A B C
Update: don't forget the -e on sed!
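Two caveats about the command above: s/\t\t/\t/g only collapses pairs, so a run of three tabs still leaves two, and not every sed understands \t. A sketch that avoids both, assuming a literal tab obtained from printf is acceptable (collapse_tabs is a hypothetical name):

```shell
# collapse_tabs: hypothetical helper; squeezes every run of one or more
# tabs on stdin down to a single tab.
collapse_tabs() {
    tab=$(printf '\t')          # a literal tab character
    # one tab followed by zero or more tabs -> exactly one tab
    sed "s/${tab}${tab}*/${tab}/g"
}

# tr makes the remaining tabs visible as ">"
printf 'A\t\tB\t\t\tC\n' | collapse_tabs | tr '\t' '>'
# prints: A>B>C
```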

Replace string if first letter is uppercase using sed

I tried to write a sed answer to the question "Edit a file using sed/awk" using:
sed -e 's/^[A-Z]/$:$&/' file.txt
but the result is:
wednesday
$:$Weekday
$:$thursday
$:$Weekday
$:$friday
$:$Weekday
$:$saturday
$:$MaybeNot
$:$sunday
$:$MaybeNot
$:$monday
$:$Weekday
$:$tuesday
$:$Weekday
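Why does it replace when the first character is lowercase?
According to this bug report, this is a "feature" caused by the unexpected character ordering (collation) in the locale, further explained here and here. Note how LC_ALL=C or the [[:upper:]] and [[:lower:]] classes give the expected result: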
$ locale
LANG=en_GB.UTF-8
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_ALL=
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | sed -e 's/[A-Z]/./g'
..........................a.........................
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | sed -e 's/[a-z]/./g'
.........................Z..........................
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | LC_ALL=C sed -e 's/[A-Z]/./g'
..........................abcdefghijklmnopqrstuvwxyz
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | LC_ALL=C sed -e 's/[a-z]/./g'
ABCDEFGHIJKLMNOPQRSTUVWXYZ..........................
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | sed -e 's/[[:upper:]]/./g'
..........................abcdefghijklmnopqrstuvwxyz
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" | sed -e 's/[[:lower:]]/./g'
ABCDEFGHIJKLMNOPQRSTUVWXYZ..........................
$ sed --version
GNU sed version 4.2.1
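Applied to the command from the question, a sketch using the [[:upper:]] class, which does not depend on how the locale orders A-Z (mark_capitalized is a hypothetical name):

```shell
# mark_capitalized: hypothetical helper; prefixes "$:$" to every stdin
# line whose first character is an uppercase letter.
mark_capitalized() {
    # & in the replacement stands for the matched uppercase letter.
    sed -e 's/^[[:upper:]]/$:$&/'
}

printf 'wednesday\nWeekday\n' | mark_capitalized
# prints:
# wednesday
# $:$Weekday
```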