I have a working example of a substitution in an online regex tester, https://regex101.com/r/3FKdLL/1, and I want to use it as a substitution in sed.
echo "repo-2019-12-31-14-30-11.gz" | sed -r 's/^([\w-]+)-\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}.gz$.*/\1/p'
It always prints the whole string, repo-2019-12-31-14-30-11.gz, rather than the matched group [\w-]+.
I expect to get only the text from the group, which is repo in this example.
Try this:
echo "repo-2019-12-31-14-30-11.gz" |
sed -rn 's/^([A-Za-z]+)-[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}-[[:digit:]]{2}-[[:digit:]]{2}-[[:digit:]]{2}\.gz.*$/\1/p'
Explanations:
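\w works on its own in GNU sed, but inside a bracket expression it is not special: [\w-] matches a backslash, a w or a hyphen. Use the POSIX class [[:alnum:]] instead
In sed, \d is not a digit class; GNU sed reads \dNNN as the character with decimal code NNN. Use [[:digit:]] or [0-9]
Add -n to suppress sed's automatic printing, and use the p flag to print only the lines where the substitution succeeded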
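To make the -n and p point concrete, here is a minimal sketch. The sample helper and the simplified pattern are only for illustration and are not part of the original answer. A line that does not match is left untouched and still auto-printed when -n is absent, which is why the attempt in the question echoed the whole file name.

```shell
# Two sample lines: one that matches the pattern and one that does not.
sample() { printf '%s\n' 'repo-2019-12-31-14-30-11.gz' 'README'; }

# Without -n: every line is auto-printed, and the p flag prints the changed line again.
without_n=$(sample | sed 's/-[0-9].*$//p')

# With -n: auto-print is off, so only lines where the substitution succeeded appear.
with_n=$(sample | sed -n 's/-[0-9].*$//p')

printf 'without -n:\n%s\n\nwith -n:\n%s\n' "$without_n" "$with_n"
```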
Additionally, you could refactor your regex to remove the duplication:
echo "repo-2019-12-31-14-30-11.gz" |
sed -rn 's/^([[:alnum:]]+)-[[:digit:]]{4}(-[[:digit:]]{2}){5}\.gz.*$/\1/p'
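The refactored pattern can be wrapped in a small function. This is only a sketch under a few assumptions: strip_timestamp is a hypothetical name, -E is used in place of -r (GNU and BSD sed both accept it), the dot before gz is escaped, and hyphens and underscores are allowed inside the name.

```shell
# Hypothetical helper: print the name part of NAME-YYYY-MM-DD-hh-mm-ss.gz,
# or nothing at all when the argument does not have that shape.
strip_timestamp() {
  printf '%s\n' "$1" |
    sed -En 's/^([[:alnum:]_-]+)-[0-9]{4}(-[0-9]{2}){5}\.gz$/\1/p'
}

strip_timestamp "repo-2019-12-31-14-30-11.gz"
strip_timestamp "my-repo-2019-12-31-14-30-11.gz"
```

Because -n is combined with the p flag, an input that does not match produces no output, so the caller can test for an empty result.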
Looks like a job for GNU grep :
echo "repo-2019-12-31-14-30-11.gz" | grep -oP '^\K[[:alpha:]-]+'
Displays :
repo-
On this example :
echo "repo-repo-2019-12-31-14-30-11.gz" | grep -oP '^\K[[:alpha:]-]+'
Displays :
repo-repo-
I think that is what you want, since you tried [\w-]+ in your regex.
If I'm wrong, just replace the grep command with: grep -oP '^\K\w+'
I am trying to replace a series of asterisk symbols in a text file with -999.9 using sed. However, I can't figure out how to properly escape the wildcard symbol.
e.g.
$ echo "2006.0,1.0,************,-5.0" | sed 's/************/-999.9/g'
sed: 1: "s/************/-999.9/g": RE error: repetition-operator operand invalid
Doesn't work. And
$ echo "2006.0,1.0,************,-5.0" | sed 's/[************]/-999.9/g'
2006.0,1.0,-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9,-5.0
puts a -999.9 for every * which isn't what I intended either.
Thanks!
Use this:
echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
Test:
$ echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
2006.0,1.0,-999.9,-5.0
Any of these (and more) is a regexp that will modify that line as you want:
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\**/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\+/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*+/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{12\}/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{12}/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{1,\}/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{1,}/-999.9/g'
2006.0,1.0,-999.9,-5.0
sed operates on regular expressions, not strings, so you need to learn regular expression syntax to use it, in particular the difference between BREs (which sed uses by default), EREs (which some seds can be told to use instead) and PCREs (which sed never uses but some other tools and "regexp checkers" do). The first solution above is a plain BRE that will work on all seds on all platforms; the \{...\} interval forms are also POSIX BRE, whereas \+ is a GNU extension and -r needs a sed with ERE support. Google is your friend.
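As a side-by-side sketch of the BRE and ERE spellings, using the -999.9 fill value from the question (the variable names are only for illustration, and -E is used as the ERE switch because both GNU and BSD sed accept it):

```shell
line='2006.0,1.0,************,-5.0'

# BRE, portable: one literal asterisk followed by zero or more asterisks.
bre=$(printf '%s\n' "$line" | sed 's/\*\**/-999.9/g')

# BRE interval expression: one or more literal asterisks.
bre_interval=$(printf '%s\n' "$line" | sed 's/\*\{1,\}/-999.9/g')

# ERE (-E): + is a quantifier without a backslash; the asterisk still needs one.
ere=$(printf '%s\n' "$line" | sed -E 's/\*+/-999.9/g')

printf '%s\n' "$bre" "$bre_interval" "$ere"
```

All three commands print 2006.0,1.0,-999.9,-5.0.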
* is a regex symbol that needs to be escaped.
You can even use Bash string replacement (the pattern here is a glob, not a regex):
s="2006.0,1.0,************,-5.0"
echo "${s/\**,/-999.9,}"
2006.0,1.0,-999.9,-5.0
Using sed:
sed 's/\*\+/-999.9/g' <<< "$s"
2006.0,1.0,-999.9,-5.0
Yes, * is a special metacharacter that repeats the previous token zero or more times. Escape each * in order to match literal * characters.
sed 's/\*\*\*\*\*\*\*\*\*\*\*\*/-999.9/g'
I have no idea when this was introduced into gawk, but unescaped asterisks work there:
gawk -F, '{sub(/************/,"-999.9",$3)}1' OFS=, file
2006.0,1.0,-999.9,-5.0
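If relying on gawk's handling of bare asterisks feels fragile, a bracket expression avoids the question altogether. The following is a sketch rather than the answer's own method: fill_missing is a hypothetical name, and it assumes a field counts as missing only when it consists entirely of asterisks.

```shell
# Hypothetical helper: read CSV lines on stdin and replace every field that
# is made up only of asterisks with -999.9. Other fields are left untouched.
fill_missing() {
  awk -F, -v OFS=, '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[*]+$/) $i = "-999.9"
    print
  }'
}

printf '%s\n' '2006.0,1.0,************,-5.0' | fill_missing
```

Inside [...] the asterisk has no special meaning, so no escaping is needed, and the loop handles a run of asterisks of any length in any column.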
I have an input string in the following format:
bugfix/ABC-12345-1-00
I want to extract "ABC-12345". Regex for that format in C# looks like this:
.*\/([A-Z]+-[0-9]+).*
How can I do that in a bash script? I've tried sed and awk but had no success, because I need to extract the value from the capturing group and skip the rest.
If your grep supports -P then you could use the below grep commands.
$ echo 'bugfix/ABC-12345-1-00' | grep -oP '/\K[A-Z]+-\d+'
ABC-12345
\K keeps the text matched so far out of the overall regex match.
$ echo 'bugfix/ABC-12345-1-00' | grep -oP '(?<=/)[A-Z]+-\d+'
ABC-12345
(?<=/) is a positive lookbehind, which asserts that the match must be preceded by a / symbol.
With sed:
$ echo 'bugfix/ABC-12345-1-00' | sed 's~.*/\([A-Z]\+-[0-9]\+\).*~\1~'
ABC-12345
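For use inside a script, the sed approach can be wrapped in a function. This sketch makes some assumptions of its own: extract_ticket is a hypothetical name, the pattern is written in plain BRE without the GNU-only \+, and -n with the p flag is added so that input without a match produces no output instead of being echoed back.

```shell
# Hypothetical helper: print the LETTERS-DIGITS key that follows the last
# slash of a branch name, or nothing when the branch has no such key.
extract_ticket() {
  printf '%s\n' "$1" |
    sed -n 's~^.*/\([A-Z][A-Z]*-[0-9][0-9]*\).*$~\1~p'
}

ticket=$(extract_ticket "bugfix/ABC-12345-1-00")
printf '%s\n' "$ticket"
```

Only the capturing group survives the substitution, which gives the same result as the capturing group in the C# regex.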
echo "bugfix/ABC-12345-1-00" | perl -ne 'print "$1\n" if /([A-Z]+-[0-9]+)/'
You could try something like:
echo "bugfix/ABC-12345-1-00" | grep -Eo '[A-Z]+-[0-9]+'
OUTPUT:
ABC-12345
If you would rather not use a regex, you can use this awk, which prints everything after the last / (note that it keeps the trailing -1-00):
echo "bugfix/ABC-12345-1-00" | awk -F\/ '{print $NF}'
ABC-12345-1-00
Or just this:
awk -F\/ '$0=$NF'
If I have a string that contains this somewhere (Foo could be anything):
<tag>Foo</tag>
How would I, using SED and RegEx, replace it with this:
[tag]Foo[/tag]
My failed attempt:
echo "<tag>Foo</tag>" | sed "s/<tag>\(.*\)<\\/tag>/[tag]\1[\\/tag]"
Your regex is missing the terminating /
$ echo "<tag>Foo</tag>" | sed "s/<tag>\(.*\)<\\/tag>/[tag]\1[\\/tag]/"
[tag]Foo[/tag]
With this you can replace any tag, not just one specific tag, and text before the tag is kept:
$ echo "<tag>Foo</tag>" | sed "s/<\([^>]*\)>\([^<]*\)<\([^>]*\)>/[\1]\2[\3]/"
Hope this helps.
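When a line may hold several tags, a simpler variation is to convert each angle-bracket pair on its own with the g flag. This is a sketch with a hypothetical function name, convert_tags. It rewrites every <...> it finds, balanced or not, so it is not an HTML parser.

```shell
# Hypothetical helper: turn every <...> on stdin into [...], keeping the text
# inside the brackets (including a leading slash) and the text around them.
convert_tags() {
  sed 's/<\([^<>]*\)>/[\1]/g'
}

printf '%s\n' '<tag>Foo</tag>' | convert_tags
printf '%s\n' 'x <b>bold</b> and <i>italic</i>' | convert_tags
```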
$ echo "Anirudh   Tomer" | sed 's/ +/ /g'
Anirudh   Tomer
I was expecting it to collapse the 3 spaces between Anirudh and Tomer and give me "Anirudh Tomer" as the result.
I am a beginner.
Thanks in advance for the help.
You need to enable sed's extended regexp support with the -r flag.
echo "Anirudh   Tomer" | sed -r 's/ +/ /g'
In extended regular expressions, the ?, + and | metacharacters must not be escaped (see wikipedia). The * metacharacter works because it belongs to the basic regular expressions.
As in Vim regex, in sed's default (basic) syntax you need to escape the + quantifier with a backslash (a GNU extension):
sed 's/ \+/ /g'
echo "Anirudh   Tomer" | tr -s ' '
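The approaches in this thread can be compared side by side. This is a sketch: the variable names are only for illustration, -E stands in for -r since GNU and BSD sed both accept it, and the first form is a plain BRE spelling (a space followed by zero or more spaces) that needs no extension.

```shell
s='Anirudh   Tomer'

# BRE, portable: a space followed by zero or more spaces.
bre=$(printf '%s\n' "$s" | sed 's/  */ /g')

# ERE: + means one or more and needs no backslash.
ere=$(printf '%s\n' "$s" | sed -E 's/ +/ /g')

# tr -s squeezes each run of the listed character down to a single one.
squeezed=$(printf '%s\n' "$s" | tr -s ' ')

printf '%s\n' "$bre" "$ere" "$squeezed"
```

Each of the three prints Anirudh Tomer with a single space.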