bash: translate a string to replace combination of 'character+whitespace' with comma - regex

I am trying to translate(tr) a string to replace combination of two characters with comma.
string:-
input is "test-1 - test-2 - test-3"
desired output is "test-1 ,test-2 ,test-3"
To achieve this I need to replace " -" [space + '-'] with comma [,]
I tried the below options
$ echo "test-1 - test-2 - test-3" | tr '-[:space:]' ','
$ echo "test-1 - test-2 - test-3" | tr '- ' ','
but both throw an error. It works for a combination of any other two characters, but not with a space.

You can use sed instead of tr to achieve this:
$ echo "test-1 - test-2 - test-3" | sed "s/ - / ,/g"
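A likely reason for the error, as far as I can tell: tr treats a first argument that starts with - as an option, and in any case tr maps single characters one-to-one, so it cannot replace a two-character sequence. A minimal sketch of the sed approach wrapped in a function (replace_sep is a hypothetical name, not something from the question):

```shell
# replace_sep is a hypothetical helper name used only for this sketch.
# It replaces every " - " (space, hyphen, space) with " ,".
replace_sep() {
    printf '%s\n' "$1" | sed 's/ - / ,/g'
}

replace_sep 'test-1 - test-2 - test-3'
# prints: test-1 ,test-2 ,test-3
```

In bash, the pattern substitution "${s// - / ,}" should give the same result without starting an external process.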

Related

bash sed replace from first occurrence to right

I am trying to obtain the first word before /
I am using following sed:
echo 'a/b/c' | sed 's/\(.*\)\/\(.*\)/\1/g'
But this gives me a/b, I would like it to give me only a
Any ideas?
You can use
echo 'a/b/c' | sed 's,\([^/]*\)/.*,\1,'
Details:
\([^/]*\) - Group 1 (\1): any zero or more chars other than /
/ - a / char
.* - the rest of the string.
Or, if you have a variable, you can use string manipulation:
s='a/b/c'
echo "${s%%/*}"
# => a
Here, %% removes the longest suffix that matches the /* glob pattern, that is, everything from the first / to the end of the string, including the / itself.
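For comparison, here is a small sketch of the four prefix/suffix removal expansions side by side. The variable names (first, parent, rest, last) are only illustrative:

```shell
s='a/b/c'

first=${s%%/*}   # longest suffix matching "/*" removed   -> a
parent=${s%/*}   # shortest suffix matching "/*" removed  -> a/b
rest=${s#*/}     # shortest prefix matching "*/" removed  -> b/c
last=${s##*/}    # longest prefix matching "*/" removed   -> c

printf '%s\n' "$first" "$parent" "$rest" "$last"
```

A doubled operator (%% or ##) picks the longest match, a single one (% or #) the shortest.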
This can be done easily in bash itself without calling any external utility:
s='a/b/c'
echo "${s%%/*}"
a
# or else
echo "${s/\/*}"
a
Using sed and an alternate delimiter to prevent a conflict with a similar char in the data.
$ echo 'a/b/c' | sed 's#/.*##'
a
Use cut instead of sed to isolate char-separated fields for clarity, simplicity, and efficiency:
$ echo 'a/b/c' | cut -d'/' -f1
a
$ echo 'a/b/c/d/e/f/g/h/i' | cut -d'/' -f5
e

Rename all files within a folder to include checksum in the file name on Unix/Mac

I want to rename a bunch of images to include the SHA checksum in the file name. So, for example, this
twitter-icon.png
facebook-icon.png
linkedin-icon.png
becomes this
twitter-icon.23rjvn28374ughf1i2je72392qdh2jf.png
facebook-icon.89394udjnx2ebh28hdb8eghddgbxn3d.png
linkedin-icon.j399hdd83gh28bdb2nedudhdn299dhj.png
The closest I've come is this command
shasum * | sed -e 's/\([^ ]*\) \(.*\(\..*\)\)$/mv -v \2 \2\1\3/' | sh
It gives almost the desired result with one "but" - it preserves the file extension in the generated name, like so
twitter-icon.png23rjvn28374ughf1i2je72392qdh2jf.png
^^^
How can I get rid of that extension in the middle and get the clean image name before the checksum suffix?
I think you want this:
shasum * | sed -e 's/\([^ ]*\) *\(.*\)\.\(.*\)$/mv -v "\2.\3" "\2.\1.\3"/' | sh
Maybe do a test run without the | sh first.
You can also use
shasum * | sed -E 's/^([^ ]+) +(.*\.)([^.]*)$/mv -- "\2\3" "\2\1.\3"/' | sh
Details:
-E - enables the POSIX ERE syntax
s - substitution command
^([^ ]+) +(.*\.)([^.]*)$ - finds
^ - start of string
([^ ]+) - Group 1: one or more chars other than a space (the checksum)
" +" (a space followed by +) - one or more spaces
(.*\.) - Group 2: any zero or more chars and then a . (the file name up to and including its last dot)
([^.]*) - Group 3: any zero or more chars other than a . (the extension)
$ - end of string
mv -- "\2\3" "\2\1.\3" - replaces the match with mv -- ", then the Group 2 and Group 3 values concatenated, then " ", then the Group 2 and Group 1 values, a . and the Group 3 value, then a closing ".
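As an alternative to piping generated commands into sh, the new name can be built with parameter expansion in a loop. This is only a sketch under a few assumptions: checksum_name is a hypothetical helper, the files of interest match *.png, and shasum prints the digest as the first space-separated field. It prints the mv commands instead of running them, so it works as a dry run:

```shell
# checksum_name is a hypothetical helper used only for this sketch.
# $1 = file name, $2 = checksum; prints the new name, renames nothing.
checksum_name() {
    base=${1%.*}    # name up to (not including) the last dot
    ext=${1##*.}    # text after the last dot
    printf '%s.%s.%s\n' "$base" "$2" "$ext"
}

# Dry run: show what would be renamed.
for f in *.png; do
    [ -e "$f" ] || continue              # skip when nothing matches *.png
    sum=$(shasum "$f" | cut -d' ' -f1)   # first field is the digest
    printf 'mv -- %s %s\n' "$f" "$(checksum_name "$f" "$sum")"
done
```

Once the printed commands look right, the printf line could be swapped for a real mv -- "$f" "$(checksum_name "$f" "$sum")".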

How to extract text between first 2 dashes in the string using sed or grep in shell

I have the string like this feature/test-111-test-test.
I need to extract string till the second dash and change forward slash to dash as well.
I have to do this in a Makefile using shell syntax, and the regular expressions that might help in this case don't work for me there.
Finally, I need to get something like this:
input - feature/test-111-test-test
output - feature-test-111- or at least feature-test-111
echo 'feature/test-111-test-test' | grep -oP '\A(?:[^-]++-??){2}' | sed -e 's/\//-/g'
But grep -oP doesn't work in my case. This regexp doesn't work as well - (.*?-.*?)-.*.
Another sed solution using a capture group and regex/pattern iteration (same thing Socowi used):
$ s='feature/test-111-test-test'
$ sed -E 's/\//-/;s/^(([^-]*-){3}).*$/\1/' <<< "${s}"
feature-test-111-
Where:
-E - enable extended regex support
s/\//-/ - replace / with -
s/^....*$/ - match start and end of input line
(([^-]*-){3}) - capture group #1 that consists of 3 sets of anything not - followed by -
\1 - print just the capture group #1 (this will discard everything else on the line that's not part of the capture group)
To store the result in a variable:
$ url=$(sed -E 's/\//-/;s/^(([^-]*-){3}).*$/\1/' <<< "${s}")
$ echo $url
feature-test-111-
You can use awk keeping in mind that in Makefile the $ char in awk command must be doubled:
url=$(shell echo 'feature/test-111-test-test' | awk -F'-' '{gsub(/\//, "-", $$1);print $$1"-"$$2"-"}')
echo "$url"
# => feature-test-111-
Here, -F'-' sets the field delimiter to -, gsub(/\//, "-", $1) replaces / with - in Field 1, and print $1"-"$2"-" prints Fields 1 and 2, each followed by a -.
Or, with a regex as a field delimiter:
url=$(shell echo 'feature/test-111-test-test' | awk -F'[-/]' '{print $$1"-"$$2"-"$$3"-"}')
echo "$url"
# => feature-test-111-
The -F'[-/]' option sets the field separator to - and /.
The '{print $1"-"$2"-"$3"-"}' part prints the first, second and third value with a separating hyphen.
To get the nth occurrence of a character C you don't need fancy perl regexes. Instead, build a regex of the form "(anything that isn't C, then C) for n times":
grep -Eo '([^-]*-){2}' | tr / -
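A sketch of the same idea with the count passed as a parameter. The function name upto_nth_dash is hypothetical, and the leading ^ anchor is my addition, intended to restrict the match to the start of the line:

```shell
# upto_nth_dash is a hypothetical helper used only for this sketch.
# $1 = n; reads lines on stdin and prints each one up to and including
# its nth "-". Lines with fewer than n dashes produce no output.
upto_nth_dash() {
    grep -Eo "^([^-]*-){$1}"
}

printf '%s\n' 'feature/test-111-test-test' | upto_nth_dash 2 | tr / -
# prints: feature-test-111-
```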
With sed and cut
echo feature/test-111-test-test| cut -d'-' -f-2 |sed 's/\//-/'
Output
feature-test-111
echo feature/test-111-test-test| cut -d'-' -f-2 |sed 's/\//-/;s/$/-/'
Output
feature-test-111-
You can use the simple regex form "not something, then that something", which here is [^-]*-, to match all characters other than - up to and including a -.
This works:
echo 'feature/test-111-test-test' | sed -nE 's/^([^/]*)\/([^-]*-[^-]*-).*/\1-\2/p'
feature-test-111-
Another idea using parameter expansions/substitutions:
s='feature/test-111-test-test'
tail="${s//\//-}" # replace '/' with '-'
# split first field from rest of fields ('-' delimited); do this 3 times
head="${tail%%-*}" # pull first field
tail="${tail#*-}" # drop first field
head="${head}-${tail%%-*}" # pull first field; append to previous field
tail="${tail#*-}" # drop first field
head="${head}-${tail%%-*}-" # pull first field; append to previous fields; add trailing '-'
$ echo "${head}"
feature-test-111-
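The three repeated steps above could also be written as a loop. A minimal sketch, assuming the input has at least as many - separated fields as requested (first_fields is a hypothetical name, and tr is used for the / replacement so that the function also runs in a plain POSIX shell):

```shell
# first_fields is a hypothetical helper used only for this sketch.
# $1 = number of fields to keep, $2 = input string.
first_fields() {
    rest=$(printf '%s' "$2" | tr / -)   # replace '/' with '-'
    out=''
    i=0
    while [ "$i" -lt "$1" ]; do
        out="${out}${rest%%-*}-"        # append first field plus '-'
        rest=${rest#*-}                 # drop first field
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

first_fields 3 'feature/test-111-test-test'
# prints: feature-test-111-
```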
A short sed solution, without extended regular expressions:
sed 's|\(.*\)/\([^-]*-[^-]*\).*|\1-\2|'

Awk/ Perl regular expression to match space with hyphen

I have the following 2 lines in sample.txt
AIA - 1000
AIA Integrations for E-Business Suite - 5544
Now i want to see the following output:
Column1 | Column2
AIA 1000
AIA Integrations for E-Business Suite 5544
i tried:
awk -F "-" sample.txt
It filters the hyphen "-" near "E-Business Suite"
How to make it filter the last hyphen instead of the intermediate ones.
You can use:
awk -F ' - ' -v OFS=';' 'BEGIN{print "Column1", "Column2"} {print $1, $2}' file |
column -s ';' -t
Column1 Column2
AIA 1000
AIA Integrations for E-Business Suite 5544
-F ' - ' uses " - " as the input field separator
-v OFS=';' uses ; as output field separator
column -s ';' -t formats data in tabular format using ; as delimiter
Another example, using split and join:
perl -F- -e 'print join "\t", reverse pop @F, join "-", @F' sample.txt
I would use perl to guarantee that we are truly catching the last - as the separator, not some other instance of it in the middle of the first field:
perl -wnle '/^(.+) - (.+)$/ or die; print "$1\t$2"' sample.txt
If you want the output to be in fixed width columns, you can use column:
perl -wnle '/^(.+) - (.+)$/ or die; print "$1\t$2"' sample.txt | column -s $'\t' -t
Explanation: Because + is greedy, the first (.+) matches the largest possible substring, so if the line contains several instances of " - ", all but the last one end up in the first capture group. The second (.+) then captures the remaining characters after the last " - ".
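The same greedy behaviour can be seen with sed. A small sketch, assuming a sed that supports -E (split_last and the tab variable are hypothetical names used only here):

```shell
tab=$(printf '\t')   # a literal tab, since \t in a sed replacement is not portable

# split_last is a hypothetical helper used only for this sketch.
# It replaces the LAST " - " on each line with a tab.
split_last() {
    sed -E "s/^(.+) - (.+)\$/\\1${tab}\\2/"
}

printf '%s\n' 'AIA - 1000' 'a - b - 7' | split_last
```

For the second line, the greedy first group takes "a - b", so only the final " - " becomes the tab.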

sed command to replace multiple spaces into single spaces

I tried to replace multiple spaces in a file to single space using sed.
But it splits each and every character like below.
Please let me know what the problem is ...
$ cat test.txt
iiHi Hello Hi
this is loga
$
$ cat test.txt | tr [A-Z] [a-z]|sed -e "s/ */ /g"
i i h i h e l l o h i
t h i s i s l o g a
Your sed command does the wrong thing because it matches "zero or more spaces", which of course happens between each pair of characters! Instead of s/ */ /g (one space before the *) you want s/  */ /g (two spaces before the *) or, with extended regular expressions (sed -E), s/ +/ /g.
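A quick way to compare the patterns on a small input (squeeze_spaces is a hypothetical helper name, and the sample string is made up for this sketch):

```shell
# squeeze_spaces is a hypothetical helper used only for this sketch.
# Two spaces before the * means "one space, then zero or more spaces".
squeeze_spaces() {
    sed 's/  */ /g'
}

printf '%s\n' 'a  b   c' | squeeze_spaces        # prints: a b c
printf '%s\n' 'a  b   c' | sed -E 's/ +/ /g'     # prints: a b c
```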
Using tr, the -s option will squeeze consecutive chars to a single one:
tr -s '[:space:]' < test.txt
iiHi Hello Hi
this is loga
To downcase as well: tr -s '[:space:]' < test.txt | tr '[:upper:]' '[:lower:]'
sed 's/ \+/ /g' test.txt | tr '[A-Z]' '[a-z]'
or
sed 's/\s\+/ /g' test.txt | tr '[A-Z]' '[a-z]'
Good grief, that was terse. Because * matches zero or more spaces, your command inserted a space after every character; you want \+ (a GNU sed extension, like \s), which matches one or more. I also switched the order, because that way you don't have to cat the file.
You can use awk to solve this:
awk '{$0=tolower($0);$1=$1}1' test.txt
iihi hello hi
this is loga
Maybe you can match the following regex for multiple spaces:
'\s+'
and replace with just one space as follows:
' '