How do I replace the last matching dot?
For example, I'd like to change test.jpg to test.th.jpg.
What I've tried:
echo "test.jpg" | sed 's#[^\.]*$#\.th.#g'
This should work:
$ echo "test.jpg" | sed 's/\.\([^.]*\)$/.th.\1/'
It gives:
test.th.jpg
Explanation:
\. literal dot
\( start capturing group
[^.] any character except dot
* zero or more
\) close capturing group
$ end of line
In the replacement, \1 stands for the content of the first capturing group :)
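The same substitution can be wrapped in a small function to see how it behaves with several dots or with none. This is only a sketch; the function name add_th is made up for illustration.

```shell
# Hypothetical helper: insert ".th" before the last extension of a file name.
add_th() {
    # \.\([^.]*\)$ matches the last dot and captures what follows it
    printf '%s\n' "$1" | sed 's/\.\([^.]*\)$/.th.\1/'
}

add_th "test.jpg"       # test.th.jpg
add_th "test.abc.jpg"   # test.abc.th.jpg
add_th "noext"          # noext (no dot, so nothing is replaced)
```

Because [^.]* cannot cross a dot and $ anchors the match at the end, only the last dot can ever match.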
kent$ sed 's/[^.]*$/th.&/' <<<"test.jpg"
test.th.jpg
or
kent$ sed 's/.*\./&th./' <<<"test.jpg"
test.th.jpg
or
kent$ awk -F. -v OFS="." '$NF="th."$NF' <<< "test.jpg"
test.th.jpg
You can also use awk, prepend "th." to the last field
$ awk 'BEGIN{FS=OFS="."}$NF="th."$NF' <<< "test.jpg"
test.th.jpg
Using pure bash
$ str="test.abc.jpg"
$ echo "${str%.*}.th.${str##*.}"
test.abc.th.jpg
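A slightly more defensive version of the parameter-expansion approach might look like this. It is a sketch, not a drop-in tool: the function name insert_tag and its second argument are assumptions made for the example, and names without a dot are simply returned unchanged.

```shell
# Hypothetical helper: insert_tag NAME TAG -> NAME with ".TAG" before the extension.
insert_tag() {
    case $1 in
        # ${1%.*} drops the shortest suffix starting at a dot (the extension),
        # ${1##*.} drops the longest prefix ending at a dot (keeps the extension)
        *.*) printf '%s.%s.%s\n' "${1%.*}" "$2" "${1##*.}" ;;
        # no dot at all: nothing to split
        *)   printf '%s\n' "$1" ;;
    esac
}

insert_tag "test.abc.jpg" th   # test.abc.th.jpg
insert_tag "README" th         # README
```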
If you have Ruby on your system:
echo test.jpg | ruby -e 'x=gets.split(".");x.insert(-2,"th"); puts x.join(".")'
Related
I have the following strings:
text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1
with a regular expression, I want to extract:
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
I have tried with regex101.com the following expression: ([^:]+)(?::[^:]+){1}$
and it worked (only for the first string)
But if I try in bash, it does not
echo "text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1" | sed -n "/([^:]+)(?::[^:]+){1}$/p"
It would be much easier with cut without any regex:
cut -d: -f3- file
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
Non-capturing groups (?:...) are not supported in sed, and in basic regular expressions (the default) you have to escape \( \) \{ \} and \+
You can repeat 2 occurrences of : from the start of the string and replace that with an empty string.
sed 's/^\([^:]\+:\)\{2\}//' file
Or using sed -E for extended regexp:
sed -E 's/^([^:]+:){2}//' file
Output
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
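To compare the two syntaxes side by side, here is the same substitution in basic and in extended form. This is a sketch assuming a sed that accepts -E. The function names are invented for the example, and the basic form uses \{1,\} because \+ is a GNU extension.

```shell
sample='text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1'

# Basic regular expressions (default): groups and intervals are escaped.
# \{1,\} is the portable spelling of "one or more".
strip_bre() { printf '%s\n' "$1" | sed 's/^\([^:]\{1,\}:\)\{2\}//'; }

# Extended regular expressions (-E): no escaping needed.
strip_ere() { printf '%s\n' "$1" | sed -E 's/^([^:]+:){2}//'; }

strip_bre "$sample"   # text_i_w4nt_to::k33p.until_th3_end_1
strip_ere "$sample"   # text_i_w4nt_to::k33p.until_th3_end_1
```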
Using sed
$ sed 's|\([^:]*:\)\{2\}\(.*\)$|\2|' input_file
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
or
$ sed 's|\([^:]*:\)\{2\}||' input_file
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
There's no reason to drag sed or other external programs into this; just use bash's built in regular expression matching:
#!/usr/bin/env bash
strings=(text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1)
for s in "${strings[@]}"; do
[[ $s =~ ^([^:]*:){2}(.*) ]] && printf "%s\n" "${BASH_REMATCH[2]}"
done
Heck, you don't need regular expressions in bash:
printf "%s\n" "${s#*:*:}"
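The expansion works because a single # removes the shortest prefix matching a glob pattern. A short sketch of the difference between # and ##, using one of the strings from the question:

```shell
s='text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1'

# ${s#pattern} removes the SHORTEST prefix matching the glob pattern,
# here everything up to and including the second colon.
printf '%s\n' "${s#*:*:}"    # text_i_w4nt_to::k33p.until_th3_end_1

# ${s##pattern} removes the LONGEST matching prefix instead,
# which eats everything up to the last colon.
printf '%s\n' "${s##*:*:}"   # k33p.until_th3_end_1
```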
awk
string='text/:some_random_text:text_i_w4nt_to:k33p.until_th3_end_1
text/:some_random_text:text_i_w4nt_to::k33p.until_th3_end_1'
awk -vFS=: -vOFS=: '{$1=$2="";gsub(/^::/,"")}1' <<<"$string"
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
There is absolutely no need for anything that requires regex backreferences, since the regex is anchored at the start of the line anyway:
mawk ++NF OFS= FS='^[^:]*:[^:]*:'
text_i_w4nt_to:k33p.until_th3_end_1
text_i_w4nt_to::k33p.until_th3_end_1
There is a string a_b_c_d. I want to replace _ with - in the part of the string between a_ and _d. This is what I tried:
echo "a_b_c_d" | sed -E 's/(.+)_(.+)_(.+)/\1`s/_/-/g \2`\3/g'
But it does not work. How can I reuse \2 and replace its content?
Perl allows code in the replacement section with the e modifier:
$ echo 'a_b_c_d' | perl -pe 's/a_\K.*(?=_d)/$&=~tr|_|-|r/e'
a_b-c_d
$ echo 'x_a_b_c_y' | perl -pe 's/x_\K.*(?=_y)/$&=~tr|_|-|r/e'
x_a-b-c_y
$&=~tr|_|-|r here $& is the matched portion, and tr is applied on that to replace _ to -
a_\K this will match a_ but won't be part of matched portion
(?=_d) positive lookahead to match _d but won't be part of matched portion
With sed (tested on GNU sed 4.2.2, not sure of syntax for other versions)
$ echo 'a_b_c_d' | sed -E ':a s/(a_.*)_(.*_d)/\1-\2/; ta'
a_b-c_d
$ echo 'x_a_b_c_y' | sed -E ':a s/(x_.*)_(.*_y)/\1-\2/; ta'
x_a-b-c_y
:a label a
s/(a_.*)_(.*_d)/\1-\2/ substitute one _ with - between a_ and _d
ta go to label a as long as the substitution succeeds
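The loop can be generalised so that it does not depend on the literal a_ and _d. The sketch below keeps the first and the last underscore and converts every one in between. The function name mid_dash is made up, and the script is split into separate -e expressions because some sed implementations do not accept a command on the same line as a label.

```shell
# Hypothetical helper: turn every "_" into "-" except the first and the last.
mid_dash() {
    printf '%s\n' "$1" |
        # each pass rewrites one inner "_"; "ta" jumps back to :a
        # for as long as the substitution keeps succeeding
        sed -E -e ':a' -e 's/^([^_]*_.*)_(.*_[^_]*)$/\1-\2/' -e 'ta'
}

mid_dash 'a_b_c_d'                 # a_b-c_d
mid_dash 'x_a_b_c_y'               # x_a-b-c_y
mid_dash '----_f-o-o_b-a-r_----'   # ----_f-o-o-b-a-r_----
```

The pattern only matches while at least three underscores remain, so the loop stops once only the outer two are left.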
gnu sed:
$ sed -r 's/_/-/g;s/(^[^-]+)-/\1_/;s/-([^-]+$)/_\1/' <<<'x_a_b_c_y'
x_a-b-c_y
The idea is, replacing all _ by -, then restoring the ones you want to keep.
update
If the fields separated by _ contain -, we can use the e flag of GNU sed (s///ge), which executes the result of the substitution as a shell command:
sed -r 's/(^[^_]+_)(.*)(_[^_]+$)/echo "\1"$(echo "\2"\|sed "s|_|-|g")"\3"/ge'
For example we want ----_f-o-o_b-a-r_---- to be ----_f-o-o-b-a-r_----:
sed -r 's/(^[^_]+_)(.*)(_[^_]+$)/echo "\1"$(echo "\2"\|sed "s|_|-|g")"\3"/ge' <<<'----_f-o-o_b-a-r_----'
----_f-o-o-b-a-r_----
Following Kent's suggestion, and if you do not need a general solution, this works:
$ echo 'a_b_c+d_x' | tr '_' '-' | sed -E 's/^([a-z]+)-(.+)-([a-z]+)$/\1_\2_\3/g'
a_b-c+d_x
The character classes should be adjusted to match the leading and trailing parts of your input string. Fails, of course, if a or x contain the '-' character.
How to use sed and regex to replace the text between a variable number of one token?
Example of input:
/abc/bcd/cde/
Expected output:
/../../../
Tried:
Command: echo "/abc/bcd/cde/" | sed 's/\/.*\//\/..\//g' output: /../
Using perl and look around assertions :
$ perl -pe 's|(?<=/)\w{3}(?=/)|..|g' file
/../../../
Using sed :
$ echo "/abc/bcd/cde/" | sed -E 's|[a-z]{3}|..|g'
/../../../
Replace every substring of non-slashes ([^/]\+) with two dots:
$> echo "/abc/bcd/cde/" | sed 's$[^/]\+$..$g'
# => /../../../
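The attempt in the question prints /../ because .* is greedy and matches everything from the first slash to the last one. Matching runs of non-slash characters avoids that. Below is a sketch of the same idea without the GNU-only \+; the function name dots is invented for the example.

```shell
# Hypothetical helper: replace each run of non-slash characters with "..".
dots() {
    # [^/][^/]* is the portable spelling of [^/]\+
    printf '%s\n' "$1" | sed 's|[^/][^/]*|..|g'
}

dots "/abc/bcd/cde/"         # /../../../
dots "/abddc/bcqsdd/cdde/"   # /../../../
```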
Based on @Gilles Quenot's implementation, but matching any run of non-slash characters between the slashes:
$ echo "/abddc/bcqsdd/cdde/" | sed -E 's|(/)?[^/]+/|\1../|g'
I have this text https://bitbucket.com/user/repo.git and I want to print repo, the content between / and .git, without including delimiters. I have this:
echo https://bitbucket.com/user/repo.git | grep -E -o '\/(.*?)\.git'
But it prints /repo.git. How can I print just repo?
Use the [^/]+(?=\.git$) pattern with -P option:
echo https://bitbucket.com/user/repo.git | grep -P -o '[^/]+(?=\.git$)'
The [^/]+(?=\.git$) pattern matches 1+ chars other than / that are followed with .git at the end of the string.
You can use sed to do that
echo https://bitbucket.com/user/repo.git | sed -e 's/^.*\/\(.*\)\.git$/\1/'
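If grep -P is not available, the name can also be taken apart without any regex. This sketch assumes the URL always ends in .git; the variable names are only illustrative.

```shell
url='https://bitbucket.com/user/repo.git'

# Parameter expansion: drop everything up to the last "/", then a trailing ".git".
name=${url##*/}
name=${name%.git}
printf '%s\n' "$name"    # repo

# basename accepts an optional suffix to remove.
basename "$url" .git     # repo
```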
I have an input string in the following format:
bugfix/ABC-12345-1-00
I want to extract "ABC-12345". Regex for that format in C# looks like this:
.*\/([A-Z]+-[0-9]+).*
How can I do that in a bash script? I've tried sed and awk but had no success because I need to extract value from the capturing group and skip the rest.
If your grep supports -P then you could use the below grep commands.
$ echo 'bugfix/ABC-12345-1-00' | grep -oP '/\K[A-Z]+-\d+'
ABC-12345
\K keeps the text matched so far out of the overall regex match.
$ echo 'bugfix/ABC-12345-1-00' | grep -oP '(?<=/)[A-Z]+-\d+'
ABC-12345
(?<=/) Positive lookbehind which asserts that the match must be preceded by a / symbol.
Through sed,
$ echo 'bugfix/ABC-12345-1-00' | sed 's~.*/\([A-Z]\+-[0-9]\+\).*~\1~'
ABC-12345
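One caveat with the plain s command is that a line which does not match is printed unchanged. A sketch that prints only on a successful match, using -n together with the p flag, could look like this. The function name extract_key is hypothetical, and [A-Z][A-Z]* is used in place of the GNU-only \+.

```shell
# Hypothetical helper: print the ticket key after the last "/", or nothing.
extract_key() {
    printf '%s\n' "$1" |
        # -n suppresses default output; the p flag prints only if s/// matched
        sed -n 's~.*/\([A-Z][A-Z]*-[0-9][0-9]*\).*~\1~p'
}

extract_key 'bugfix/ABC-12345-1-00'   # ABC-12345
extract_key 'no-ticket-here'          # prints nothing
```

In a script, an empty result can then be used to detect branch names that do not follow the expected format.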
echo "bugfix/ABC-12345-1-00"| perl -ane '/.*?([A-Z]+\-[0-9]+).*/;print $1."\n"'
You could try something like:
echo "bugfix/ABC-12345-1-00" | grep -E -o '[A-Z]+-[0-9]+'
OUTPUT:
ABC-12345
If you do not like to use regex, you can use this awk:
echo "bugfix/ABC-12345-1-00" | awk -F\/ '{print $NF}'
ABC-12345-1-00
Or just this:
awk -F\/ '$0=$NF'