matching a line with a literal asterisk "*" in grep


Tried
$ echo "$STRING" | egrep "(\*)"
and also
$ echo "$STRING" | egrep '(\*)'
and countless other variations. I just want to match a line that contains a literal asterisk anywhere in the line.

Try a character class instead
echo "$STRING" | egrep '[*]'

echo "$STRING" | fgrep '*'
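fgrep (equivalent to grep -F) treats the pattern as a fixed string rather than a regular expression, so special characters such as * are matched literally.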
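As a minimal sketch of the fixed-string approach, the following wraps the check in a small function. It assumes a POSIX grep and uses grep -F, the standard spelling of fgrep; the name has_asterisk is hypothetical and not part of the original answer.

```shell
# has_asterisk is a hypothetical helper: it succeeds if its argument
# contains a literal "*" anywhere.
has_asterisk() {
    # -F: treat the pattern as a fixed string, so "*" has no special meaning.
    # -q: print nothing; report the result through the exit status only.
    printf '%s\n' "$1" | grep -qF '*'
}

has_asterisk 'a*b' && echo "a*b: contains an asterisk"
has_asterisk 'abc' || echo "abc: no asterisk"
```

Using the exit status rather than the printed line makes the check easy to use in an if statement.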

Simply escape the asterisk with a backslash:
grep "\*"

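Use:
grep "*" file.txt
or
cat file.txt | grep "*"
This works with plain grep because an asterisk at the very start of a basic regular expression is taken literally. With egrep the same pattern can match every line instead.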

Here's one way to match a literal asterisk:
$ echo "*" | grep "[*]"
*
$ echo "*" | egrep "[*]"
*
$ echo "asfd" | egrep "[*]"
$ echo "asfd" | grep "[*]"
$
Wrapping a special character in brackets usually lets you match it literally; this also works for a right bracket (placed first inside the brackets) or a hyphen (placed first or last), for instance.
Be careful when the asterisk isn't inside brackets:
$ echo "hi" | egrep "*"
hi
$ echo "hi" | grep "*"
$
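To compare the reliable spellings side by side, here is a small sketch that runs the bracket form and the backslash form through both basic and extended matching. The three-line sample input is made up for illustration.

```shell
# Sample input (hypothetical): only the second line contains a literal asterisk.
input=$(printf '%s\n' 'plain text' 'rate * time' 'more text')

# Each of these four commands prints only "rate * time".
printf '%s\n' "$input" | grep    '[*]'   # bracket expression, basic regex
printf '%s\n' "$input" | grep -E '[*]'   # bracket expression, extended regex
printf '%s\n' "$input" | grep    '\*'    # backslash escape, basic regex
printf '%s\n' "$input" | grep -E '\*'    # backslash escape, extended regex
```

Both forms behave the same in either mode, which is why they are safer than a bare "*".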

If there is a need to detect an asterisk in awk, you can either use
awk '/\*/' file
Here, * is used in a regex, and thus, must be escaped since an unescaped * is a quantifier that means "zero or more occurrences". Once it is escaped, it no longer has any special meaning.
Alternatively, if you do not need to check for anything else, it makes sense to perform a fixed string check:
awk 'index($0, "*")' file
If * is found anywhere inside a "record" (i.e. a line) the current line will get printed.
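A possible variation on the index() check, sketched here with made-up input, also reports the line number of each hit. The function name lines_with_asterisk is hypothetical.

```shell
# lines_with_asterisk is a hypothetical helper: it reads stdin and prints
# "<line number>: <line>" for every line containing a literal "*".
lines_with_asterisk() {
    # index() is a plain substring search, so no regex escaping is involved.
    awk 'index($0, "*") { print NR ": " $0 }'
}

# Prints "2: be*ta" and "4: *".
printf '%s\n' 'alpha' 'be*ta' 'gamma' '*' | lines_with_asterisk
```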

Related

Variable in grep regex

#! /bin/bash
NAME='joe'
FILE="joewhatever#gmail.com\nwhatever#joe.com\nwhateverjoe#gmail.com"
echo -e "$FILE" | grep "${NAME}*#"
I'm getting:
whateverjoe#gmail.com
And I expect to get:
joewhatever#gmail.com
whateverjoe#gmail.com
And let out:
whatever#joe.com

A dot . is absent before the star *
#!/bin/bash
NAME='joe'
FILE="joewhatever#gmail.com\nwhatever#joe.com\nwhateverjoe#gmail.com"
echo -e "$FILE" | grep "${NAME}.*#"
joewhatever#gmail.com
whateverjoe#gmail.com

In regular expressions, * is a quantifier that means zero or more repetitions of the preceding pattern. After variable substitution, your regexp is joe*#. This matches jo followed by zero or more e followed by #. joewhatever#gmail.com doesn't match that pattern, since it has whatever between joe and #.
You want joe.*# as the regexp -- . matches any character, so .* means to match any number of them. So it should be "${NAME}.*#"
echo -e "$FILE" | grep "${NAME}.*#"
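The difference between the two patterns can be checked by counting matches. This sketch reuses the three addresses from the question and swaps echo -e for printf, which is only a portability choice.

```shell
NAME='joe'
emails=$(printf '%s\n' 'joewhatever#gmail.com' 'whatever#joe.com' 'whateverjoe#gmail.com')

# "joe*#" means "jo", zero or more "e", then "#": only the third line matches.
printf '%s\n' "$emails" | grep -c "${NAME}*#"    # prints 1

# "joe.*#" means "joe", any run of characters, then "#": lines one and three match.
printf '%s\n' "$emails" | grep -c "${NAME}.*#"   # prints 2
```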

Sed replace asterisk symbols

I am trying to replace a series of asterisk symbols in a text file with -999.9 using sed. However, I can't figure out how to properly escape the wildcard symbol.
e.g.
$ echo "2006.0,1.0,************,-5.0" | sed 's/************/-999.9/g'
sed: 1: "s/************/-999.9/g": RE error: repetition-operator operand invalid
Doesn't work. And
$ echo "2006.0,1.0,************,-5.0" | sed 's/[************]/-999.9/g'
2006.0,1.0,-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9-999.9,-5.0
puts a -999.9 for every * which isn't what I intended either.
Thanks!

Use this:
echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
Test:
$ echo "2006.0,1.0,************,-5.0" | sed 's/[*]\+/-999.9/g'
2006.0,1.0,-999.9,-5.0

Any of these (and more) is a regexp that will modify that line as you want:
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\**/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\+/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*+/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{12\}/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{12}/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed 's/\*\{1,\}/999.9/g'
2006.0,1.0,999.9,-5.0
$ echo "2006.0,1.0,************,-5.0" | sed -r 's/\*{1,}/999.9/g'
2006.0,1.0,999.9,-5.0
sed operates on regular expressions, not strings, so you need to learn regular expression syntax if you're going to use sed and in particular the difference between BREs (which sed uses by default) and EREs (which some seds can be told to use instead) and PCREs (which sed never uses but some other tools and "regexp checkers" do). Only the first solution above is a BRE that will work on all seds on all platforms. Google is your friend.
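As a short sketch of the BRE and ERE spellings with the replacement value from the question (-999.9), assuming a sed that accepts -E for extended regular expressions, as current GNU and BSD seds do:

```shell
line='2006.0,1.0,************,-5.0'

# Basic regex: one literal "*" followed by zero or more literal "*".
printf '%s\n' "$line" | sed 's/\*\**/-999.9/g'

# Extended regex: one or more literal "*" (assumes sed supports -E).
printf '%s\n' "$line" | sed -E 's/\*+/-999.9/g'
```

Both commands print 2006.0,1.0,-999.9,-5.0.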

* is a regex symbol that needs to be escaped.
You can even use BASH string replacement:
s="2006.0,1.0,************,-5.0"
echo "${s/\**,/-999.9,}"
2006.0,1.0,-999.9,-5.0
Using sed:
sed 's/\*\+/999.9/g' <<< "$s"
2006.0,1.0,999.9,-5.0

Yes, * is a special metacharacter that repeats the previous token zero or more times. Escape * in order to match literal * characters.
sed 's/\*\*\*\*\*\*\*\*\*\*\*\*/-999.9/g'

When this possibility was introduced into gawk I have no idea!
gawk -F, '{sub(/************/,"-999.9",$3)}1' OFS=, file
2006.0,1.0,-999.9,-5.0

Why [^\d\w\s,] matches "leonardo,davinci"?

I can't understand why the regexp:
[^\d\s\w,]
Matches the string:
"leonardo,davinci"
That is my test:
$ echo "leonardo,davinci" | egrep '[^\d\w\s,]'
leonardo,davinci
While this works as expected:
$ echo "leonardo,davinci" | egrep '[\S\W\D]'
$
Thanks very much

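It's because egrep doesn't support the predefined sets \d, \w, \s inside a bracket expression. There, the backslash is matched literally, so [^\d\w\s,] means "any character other than \, d, w, s or a comma", and letters such as l, e and o satisfy it. Likewise, [\S\W\D] only matches \, S, W or D, none of which occur in the string. Spelling the set out explicitly:
echo "leonardo,davinci" | egrep '[^a-zA-Z0-9 ,]'
will indeed not match.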
If you have it installed, you can use pcregrep instead:
echo "leonardo,davinci" | pcregrep '[^\w\s,]'
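Where pcregrep is not available, POSIX named classes can be used inside a bracket expression. The following is a sketch that assumes [:alnum:] plus an underscore is an acceptable stand-in for \w and [:space:] for \s; the second test string is made up.

```shell
# Inside a bracket expression, POSIX named classes stand in for \w and \s:
# [:alnum:] plus "_" approximates \w, and [:space:] corresponds to \s.
pattern='[^[:alnum:]_[:space:],]'

# No character falls outside the set, so grep fails and "no match" is printed.
printf '%s\n' 'leonardo,davinci' | grep -E "$pattern" || echo "no match"

# The ";" is outside the set, so the line is printed.
printf '%s\n' 'leonardo;davinci' | grep -E "$pattern"
```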

Regex behaviour with angle brackets

Please explain to me why the following expression doesn't output anything:
echo "<firstname.lastname#domain.com>" | egrep "<lastname#domain.com>"
but the following does:
echo "<firstname.lastname#domain.com>" | egrep "\<lastname#domain.com>"
The behaviour of the first is as expected but the second should not output. Is the "\<" being ignored within the regex or causing some other special behaviour?

As hwnd said, \< matches the beginning of a word, i.e. a word boundary \b must exist before the starting word character (the character after \< in the input must be a word character).
In your first example,
echo "<firstname.lastname#domain.com>" | egrep "<lastname#domain.com>"
egrep looks for a literal < character immediately before the lastname string. There isn't one, so it prints nothing.
$ echo "<firstname.lastname#domain.com>" | egrep "\<lastname#domain.com>"
<firstname.lastname#domain.com>
But in this example, a word boundary exists before the lastname string (it follows a dot), so the line is printed; the matched part is lastname#domain.com>.
Some more examples (in the two that print, the matched parts are com and ame-lastname#domain.com):
$ echo "namelastname#domain.com" | egrep "\<e#domain.com"
$ echo "namelastname#domain.com" | egrep "\<lastname#domain.com"
$ echo "namelastname#domain.com" | egrep "\<com"
namelastname#domain.com
$ echo "<firstname.lastname#domain.com>" | egrep "\<#domain.com>"
$ echo "n-ame-lastname#domain.com" | egrep "\<ame-lastname#domain.com"
n-ame-lastname#domain.com
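To see exactly which part of a line matched, one option is grep's -o flag. This sketch assumes GNU grep or a compatible implementation, since both -o and \< are extensions rather than POSIX features; the dot in the first pattern is escaped here so that it only matches a literal dot.

```shell
# -o prints only the matched text instead of the whole line.
# Prints: lastname#domain.com>
printf '%s\n' '<firstname.lastname#domain.com>' | grep -oE '\<lastname#domain\.com>'

# Prints: com
printf '%s\n' 'namelastname#domain.com' | grep -oE '\<com'
```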

How do I form the correct regular expression to capture everything before parentheses?

Currently I have a set of strings that are of the format
customName(path/to/the/relevant/directory|file.ext#FileRefrence_12345)
From this I would like to extract customName, the characters before the first parenthesis, using sed.
My best guesses so far are:
echo $s | sed 's/([^(])+\(.*\)/\1/g'
echo $s | sed 's/([^\(])+\(.*\)/\1/g'
However, using these I get the error:
sed: -e expression #1, char 21: Unmatched ( or \(
So how do I form the correct regular expression? And why does it matter that I do not have a matching \( if it is just an escaped character in my expression, not a character used for formatting?

You could delete everything from the opening parenthesis onward, like this (note that parentheses do not need to be escaped in sed by default):
echo 'customName(path/to/the/relevant/directory|file.ext#FileRefrence_12345)' |
sed -e 's/(.*//'

grep
kent$ echo "customName(blah)"|grep -o '^[^(]*'
customName
sed
kent$ echo "customName(blah)"|sed 's/(.*//'
customName
Note that I changed the text between the parentheses.

Different options:
$ echo $s | sed 's/(.*//' #sed (everything before "(")
customName
$ echo $s | cut -d"(" -f1 #cut (delimiter is "(", print 1st block)
customName
$ echo $s | awk -F"(" '{print $1}' #awk (field separator is "(", print 1st)
customName
$ echo ${s%(*} #bash parameter expansion (remove the suffix starting at "(")
customName
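The parameter expansion variant can be wrapped in a function, sketched below. It uses %% (longest suffix) instead of %, an adjustment made here so that input with more than one opening parenthesis is still cut at the first one; the name name_before_paren and the second test string are hypothetical.

```shell
# name_before_paren is a hypothetical helper: it prints everything before
# the first "(" in its argument.
name_before_paren() {
    # %% removes the longest suffix matching "(*", i.e. from the first "(" onward.
    printf '%s\n' "${1%%(*}"
}

# Prints: customName
name_before_paren 'customName(path/to/the/relevant/directory|file.ext#FileRefrence_12345)'

# Prints: outer
name_before_paren 'outer(inner(x))'
```

No external process is started, so this is a reasonable choice inside loops over many strings.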

We Keep Coding
