#! /bin/bash
NAME='joe'
FILE="joewhatever#gmail.com\nwhatever#joe.com\nwhateverjoe#gmail.com"
echo -e "$FILE" | grep "${NAME}*#"
I'm getting:
whateverjoe#gmail.com
And I expect to get:
joewhatever#gmail.com
whateverjoe#gmail.com
And leave out:
whatever#joe.com
A dot (.) is missing before the star (*):
#!/bin/bash
NAME='joe'
FILE="joewhatever#gmail.com\nwhatever#joe.com\nwhateverjoe#gmail.com"
echo -e "$FILE" | grep "${NAME}.*#"
joewhatever#gmail.com
whateverjoe#gmail.com
In regular expressions, * is a quantifier that means zero or more repetitions of the preceding pattern. After variable substitution, your regexp is joe*#. This matches jo followed by zero or more e followed by #. joewhatever#gmail.com doesn't match that pattern, since it has whatever between joe and #.
You want joe.*# as the regexp -- . matches any character, so .* means to match any number of them. So it should be "${NAME}.*#"
echo -e "$FILE" | grep "${NAME}.*#"
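To make the difference concrete, here is a small sketch that runs both patterns against the sample data from the question. The variable names (emails, without_dot, with_dot) are only illustrative, and printf is used instead of echo -e as a more portable way to produce the three lines.

```shell
#!/bin/bash
# Sample data from the question, one address per line.
emails=$(printf '%s\n' 'joewhatever#gmail.com' 'whatever#joe.com' 'whateverjoe#gmail.com')
name='joe'

# "joe*#" means "jo", zero or more "e", then "#".
without_dot=$(printf '%s\n' "$emails" | grep "${name}*#")

# "joe.*#" means "joe", any run of characters, then "#".
with_dot=$(printf '%s\n' "$emails" | grep "${name}.*#")

printf 'without dot:\n%s\n\nwith dot:\n%s\n' "$without_dot" "$with_dot"
```

The first pattern should list only whateverjoe#gmail.com, while the second should list both addresses that have joe somewhere before the #.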
I am trying to replace a series of asterisk symbols in a text file with -999.9 using sed. However, I can't figure out how to properly escape the wildcard symbol.
e.g.
$ echo "2006.0,1.0,************,-5.0" | sed 's/************/-999.9/g'
sed: 1: "s/************/-999.9/g": RE error: repetition-operator operand invalid
Doesn't work. And
$ echo "2006.0,1.0,************,-5.0" | sed 's/[************]/-999.9/g'
2006.0,1.0,-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9,-5.0
puts a -999.9 for every * which isn't what I intended either.
Thanks!
Use this:
echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
Test:
$ echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
2006.0,1.0,-999.9,-5.0
Any of these (and more) is a regexp that will modify that line as you want:
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\**/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\+/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*+/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{12\}/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{12}/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{1,\}/-999.9/g'
2006.0,1.0,-999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{1,}/-999.9/g'
2006.0,1.0,-999.9,-5.0
sed operates on regular expressions, not strings, so you need to learn regular expression syntax if you're going to use sed. In particular, learn the difference between BREs (which sed uses by default), EREs (which some seds can be told to use instead), and PCREs (which sed never uses but some other tools and "regexp checkers" do). Only the first solution above is a BRE that will work on all seds on all platforms. Google is your friend.
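As a rough illustration of the BRE/ERE distinction, the sketch below runs the same replacement in both dialects. It assumes a sed that accepts -E for EREs (GNU and BSD sed do); the variable names are only illustrative.

```shell
#!/bin/bash
line='2006.0,1.0,************,-5.0'

# BRE: an escaped literal "*" followed by zero or more literal "*".
bre=$(printf '%s\n' "$line" | sed 's/\*\**/-999.9/g')

# ERE via -E (assumed to be available): "+" is a quantifier, no backslash needed.
ere=$(printf '%s\n' "$line" | sed -E 's/\*+/-999.9/g')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands should collapse the run of asterisks into a single -999.9. The BRE form relies only on the escaped star and the plain star quantifier, which is why it is the most portable choice.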
* is a regex symbol that needs to be escaped.
You can even use BASH string replacement:
s="2006.0,1.0,************,-5.0"
echo "${s/\**,/-999.9,}"
2006.0,1.0,-999.9,-5.0
Using sed:
sed 's/\*\+/-999.9/g' <<< "$s"
2006.0,1.0,-999.9,-5.0
Yes, * is a special metacharacter that repeats the previous token zero or more times. Escape * in order to match literal * characters.
sed 's/\*\*\*\*\*\*\*\*\*\*\*\*/-999.9/g'
When this possibility was introduced into gawk I have no idea!
gawk -F, '{sub(/************/,"-999.9",$3)}1' OFS=, file
2006.0,1.0,-999.9,-5.0
I can't understand why the regexp:
[^\d\s\w,]
Matches the string:
"leonardo,davinci"
That is my test:
$ echo "leonardo,davinci" | egrep '[^\d\w\s,]'
leonardo,davinci
While this works as expected:
$ echo "leonardo,davinci" | egrep '[\S\W\D]'
$
Thanks very much
It's because egrep doesn't support the predefined sets \d, \w, \s inside a bracket expression. Putting a backslash in front of those letters there just matches the backslash and the letters literally:
leonardo,davinci
echo "leonardo,davinci" | egrep '[^a-zA-Z0-9 ,]'
will indeed not match.
If you have it installed, you can use pcregrep instead:
echo "leonardo,davinci" | pcregrep '[^\w\s,]'
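If pcregrep is not available, POSIX character classes are one alternative that works inside bracket expressions. The sketch below compares the original pattern with such a rewrite; the variable names are only illustrative, and treating [:alnum:] plus "_" as the equivalent of \w is an assumption that holds for plain ASCII input like this.

```shell
#!/bin/bash
input='leonardo,davinci'

# Inside [...] the backslash is literal, so this set is
# "anything except \, d, w, s and ,", which the letter l already satisfies.
if printf '%s\n' "$input" | grep -Eq '[^\d\w\s,]'; then
    backslash_form=match
else
    backslash_form=nomatch
fi

# POSIX classes: anything except alphanumerics, "_", whitespace and ",".
if printf '%s\n' "$input" | grep -Eq '[^[:alnum:]_[:space:],]'; then
    posix_form=match
else
    posix_form=nomatch
fi

echo "backslash form: $backslash_form"
echo "POSIX class form: $posix_form"
```

The backslash form should report a match and the POSIX class form should not, which mirrors the behaviour described in the question.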
Please explain to me why the following expression doesn't output anything:
echo "<firstname.lastname#domain.com>" | egrep "<lastname#domain.com>"
but the following does:
echo "<firstname.lastname#domain.com>" | egrep "\<lastname#domain.com>"
The behaviour of the first is as expected, but the second should not output anything. Is the "\<" being ignored within the regex, or is it causing some other special behaviour?
As #hwnd said, \< matches the beginning of a word, i.e. a word boundary \b must exist just before the first matched character (the character after \< in the input must be a word character).
In your example,
echo "<firstname.lastname#domain.com>" | egrep "<lastname#domain.com>"
Here egrep looks for a literal < character directly before the lastname string. There isn't one, so it prints nothing.
$ echo "<firstname.lastname#domain.com>" | egrep "\<lastname#domain.com>"
<firstname.**lastname#domain.com>**
But in this example a word boundary \b exists before the lastname string, so the line matches (the matched characters are marked with ** above).
Some more examples:
$ echo "namelastname#domain.com" | egrep "\<e#domain.com"
$ echo "namelastname#domain.com" | egrep "\<lastname#domain.com"
$ echo "namelastname#domain.com" | egrep "\<com"
namelastname#domain.**com**
$ echo "<firstname.lastname#domain.com>" | egrep "\<#domain.com>"
$ echo "n-ame-lastname#domain.com" | egrep "\<ame-lastname#domain.com"
n-**ame-lastname#domain.com**
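One way to see exactly which characters match is grep -o, which prints only the matched part. This sketch assumes a grep that supports both \< and -o (GNU grep does; both are extensions beyond POSIX), and the variable names are only illustrative.

```shell
#!/bin/bash
addr='<firstname.lastname#domain.com>'

# "." precedes "lastname", so a word starts there and \< succeeds.
at_word_start=$(printf '%s\n' "$addr" | grep -o '\<lastname#domain.com>')

# "e" sits in the middle of the word "lastname", so \< fails there.
# "|| true" keeps the script going when grep finds nothing.
mid_word=$(printf '%s\n' "$addr" | grep -o '\<e#domain.com>' || true)

printf 'at word start: [%s]\nmid word: [%s]\n' "$at_word_start" "$mid_word"
```

The first result should contain only lastname#domain.com> and the second should be empty, since \< cannot match between two word characters.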
Currently I have a set of strings of the format
customName(path/to/the/relevant/directory|file.ext#FileRefrence_12345)
From this I would like to extract customName, the characters before the first parenthesis, using sed.
My best guesses so far are:
echo $s | sed 's/([^(])+\(.*\)/\1/g'
echo $s | sed 's/([^\(])+\(.*\)/\1/g'
However, using these I get the error:
sed: -e expression #1, char 21: Unmatched ( or \(
So how do I form the correct regular expression? And why does it matter that I do not have a matched \(, if it is just an escaped character in my expression and not a character used for grouping?
You could delete everything from the opening parenthesis onward, like this (note that in sed's default syntax a plain parenthesis is a literal character and does not need to be escaped, while \( and \) delimit a group):
echo 'customName(path/to/the/relevant/directory|file.ext#FileRefrence_12345)' |
sed -e 's/(.*//'
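If you do want the capture-group approach from the question, a minimal sketch in sed's default (BRE) syntax could look like this. It is one possible repair of the attempted expression, not the only one, and the variable name captured is only illustrative.

```shell
#!/bin/bash
s='customName(path/to/the/relevant/directory|file.ext#FileRefrence_12345)'

# BRE: \( \) delimit the capture group, a bare ( is a literal parenthesis.
# Capture everything up to the first "(", then match and discard the rest.
captured=$(printf '%s\n' "$s" | sed 's/^\([^(]*\)(.*$/\1/')

echo "$captured"
```

The original attempts failed because ( was used where \( was needed to open the group, leaving the later \( and \) unbalanced.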
grep
kent$ echo "customName(blah)"|grep -o '^[^(]*'
customName
sed
kent$ echo "customName(blah)"|sed 's/(.*//'
customName
Note that I changed the text between the parentheses.
Different options:
$ echo $s | sed 's/(.*//' #sed (everything before "(")
customName
$ echo $s | cut -d"(" -f1 #cut (delimiter is "(", print 1st block)
customName
$ echo $s | awk -F"(" '{print $1}' #awk (field separator is "(", print 1st)
customName
$ echo ${s%(*}   #bash parameter expansion (remove shortest suffix matching "(*")
customName
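The parameter expansion variant can be wrapped in a small function. In this sketch, name_before_paren is a hypothetical helper name, and %% is used instead of % so that the cut happens at the first parenthesis even when the string contains several.

```shell
#!/bin/bash
# name_before_paren is a hypothetical helper, not part of the original answers.
name_before_paren() {
    # %% strips the longest suffix matching "(*",
    # i.e. everything from the first "(" to the end of the string.
    printf '%s\n' "${1%%(*}"
}

s='customName(path/to/the/relevant/directory|file.ext#FileRefrence_12345)'
result=$(name_before_paren "$s")
nested=$(name_before_paren 'outer(inner(x))')

echo "$result"
echo "$nested"
```

With a single % the second call would stop at the last parenthesis and return outer(inner, so %% is the safer choice when nested parentheses are possible. No external process is started, which makes this the cheapest option inside a loop.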