The pattern I'm looking for looks like $guid1$ with the $ signs on each side. Unfortunately, my regex in grep (and probably elsewhere) interprets that last $ as something else.
"\$guid[0-9]" works but "\$guid[0-9]\$" does not. What can I do?
You need to use single quotes around your regex:
grep '\$guid1\$' file
Or use fgrep (equivalent to grep -F) for a fixed-string search:
fgrep '$guid1$' file
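The quoting difference can be sketched as follows. The sample lines are hypothetical; only the two grep invocations follow the answer above. Inside single quotes the shell leaves \$ alone, so grep receives an escaped (literal) dollar sign at both ends, while -F switches off regex interpretation altogether.

```shell
# Hypothetical sample input: only the first line contains $guid1$
sample='a $guid1$ b
a guid1 b
a $guid12$ b'

# Single quotes: the shell passes \$ through unchanged, so grep matches literal dollar signs.
regex_hits=$(printf '%s\n' "$sample" | grep '\$guid[0-9]\$')

# -F (what fgrep does): the pattern is a fixed string, so no escaping is needed.
fixed_hits=$(printf '%s\n' "$sample" | grep -F '$guid1$')

printf 'regex: %s\nfixed: %s\n' "$regex_hits" "$fixed_hits"
```

Both variables should end up holding only the first sample line.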
I understand that what I'm asking can be accomplished using awk or sed, I'm asking here how to do this using GREP.
Given the following input:
.bash_profile
.config/ranger/bookmarks
.oh-my-zsh/README.md
I want to use GREP to get:
.bash_profile
.config/
.oh-my-zsh/
Currently I'm trying
grep -Po '([^/]*[/]?){1}'
Which results in output:
.bash_profile
.config/
ranger/
bookmarks
.oh-my-zsh/
README.md
Is there some simple way to use GREP to only get the first matched string on each line?
I think you can grep non / letters like:
grep -Eo '^[^/]+'
On another SO site there is another similar question with solution.
You don't need grep for this at all.
cut -d / -f 1
The -o option says to print every substring which matches your pattern, instead of printing each matching line. Your current pattern matches every string which doesn't contain slashes (optionally including a trailing slash); but it's easy to switch to one which only matches this pattern at the beginning of a line.
grep -o '^[^/]*' file
Notice the addition of the ^ beginning of line anchor, and the omission of the -P option (which you were not really using anyway) as well as the silly beginner error {1}.
(I should add that plain grep doesn't support unescaped parentheses or repetitions; grep -E would support these constructs just fine, or you could switch to the POSIX BRE variation, which requires a backslash to use round or curly parentheses as metacharacters. You can probably ignore these details and just use grep -E everywhere unless you really need the features of grep -P, though also be aware that -P is not portable.)
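If the trailing slash from the desired output should be kept, one possible variation is to add an optional slash after the anchored pattern. This is only a sketch: it assumes a grep that supports -E and -o together, and the variable names are made up for illustration.

```shell
# Sample paths taken from the question
paths='.bash_profile
.config/ranger/bookmarks
.oh-my-zsh/README.md'

# ^ anchors the match at the start of the line, so -o prints at most one piece per line;
# /? keeps the trailing slash when the first component is a directory.
first_components=$(printf '%s\n' "$paths" | grep -Eo '^[^/]*/?')

printf '%s\n' "$first_components"
```

The cut -d / -f 1 approach gives the same components without the trailing slash.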
I'm trying but failing to write a regex to grep for lines that do not begin with "//" (i.e. C++-style comments). I'm aware of the "grep -v" option, but I am trying to learn how to pull this off with regex alone.
I've searched and found various answers on grepping for lines that don't begin with a character, and even one on how to grep for lines that don't begin with a string, but I'm unable to adapt those answers to my case, and I don't understand what my error is.
> cat bar.txt
hello
//world
> cat bar.txt | grep "(?!\/\/)"
-bash: !\/\/: event not found
I'm not sure what this "event not found" is about. One of the answers I found used paren-question mark-exclamation-string-paren, which I've done here, and which still fails.
> cat bar.txt | grep "^[^\/\/].+"
(no output)
Another answer I found used a caret within square brackets and explained that this syntax meant "search for the absence of what's in the square brackets (other than the caret)". I think the ".+" means "one or more of anything", but I'm not sure if that's correct and, if it is correct, what distinguishes it from ".*".
In a nutshell: how can I construct a regex to pass to grep to search for lines that do not begin with "//" ?
To be even more specific, I'm trying to search for lines that have "#include" that are not preceded by "//".
Thank you.
The first line tells you that the problem is from bash (your shell). Bash finds the ! and attempts to inject into your command the last command you entered that begins with \/\/. To avoid this you need to escape the ! or use single quotes. For an example of !, try !cat; it will execute the last command beginning with cat that you entered.
You don't need to escape /; it has no special meaning in regular expressions. You also don't need to write a complicated regular expression to invert a match. Rather, just supply the -v argument to grep. Most of the time simple is better. And you also don't need to cat the file to grep; just give grep the file name. For example:
grep -v "^//" bar.txt | grep "#include"
If you're really hung up on using regular expressions, note that a simple positive pattern like the following finds the opposite set, the #include lines that are commented out (match start of string ^, any number of white space [[:space:]]*, exactly two forward slashes /{2}, any number of any characters .*, followed by #include):
grep -E "^[[:space:]]*/{2}.*#include" bar.txt
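For the asker's stated goal (lines containing #include that are not preceded by //) a single extended regex without -v is possible, at the cost of readability. The following is a sketch with hypothetical sample lines: the part of the pattern before #include is assembled from pieces that can never contain two adjacent slashes.

```shell
# Hypothetical sample source lines
src='#include <stdio.h>
  #include "util.h"
// #include <old.h>
int x; // #include <never.h>
int y;'

# The prefix before #include is built from pieces that cannot contain "//":
#   [^/]*      a leading run of non-slash characters
#   (/[^/]+)*  single slashes, each followed by at least one non-slash
#   /?         an optional lone slash directly before #include
live_includes=$(printf '%s\n' "$src" | grep -E '^[^/]*(/[^/]+)*/?#include')

printf '%s\n' "$live_includes"
```

Only the first two sample lines should be selected. The two-step grep -v pipeline is easier to read and is usually the better choice.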
You're using a negative lookahead, which is a PCRE feature and requires the -P option.
Your negative lookahead won't work without a start anchor.
This will of course require GNU grep.
You must use single quotes to use ! in your regex; otherwise history expansion is attempted with the text after the !, which is the reason for the !\/\/: event not found error.
So you can use:
grep -P '^(?!\h*//)' file
hello
\h matches a single horizontal whitespace character, so \h* matches zero or more of them.
Without -P, or with a non-GNU grep, you can use grep -v:
grep -v '^[[:blank:]]*//' file
hello
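Because -P is not available in every grep, a script that wants the lookahead form may need a fallback. A minimal sketch, assuming hypothetical variable names and a sample that adds one indented comment to the question's two lines:

```shell
# The question's sample lines plus a hypothetical indented comment
input='hello
//world
  // indented comment'

if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
    # grep built with PCRE support: negative lookahead anchored at the line start
    non_comments=$(printf '%s\n' "$input" | grep -P '^(?!\h*//)')
else
    # Portable fallback: invert an ordinary match instead
    non_comments=$(printf '%s\n' "$input" | grep -v '^[[:blank:]]*//')
fi

printf '%s\n' "$non_comments"
```

Either branch should leave only the hello line for this input.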
To find #include lines that are not preceded by // (or /* …), you can use:
grep '^[[:space:]]*#[[:space:]]*include[[:space:]]*["<]'
The regex looks for start of line, optional spaces, #, optional spaces, include, optional spaces and either " or <. It will find all #include lines except lines such as #include MACRO_NAME, which are legitimate but rare, and screwball cases such as:
#/*comment*/include/*comment*/<stdio.h>
#\
include\
<stdio.h>
If you have to deal with software containing such notations, (a) you have my sympathy and (b) fix the code to a more orthodox style before hunting the #include lines. It will pick up false positives such as:
/* Do not include this:
#include <does-not-exist.h>
*/
You could omit the final [[:space:]]*["<] with minimal chance of confusion, which will then pick up the macro name variant.
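The difference between the strict pattern and the relaxed one can be checked on a small sample. The source lines below are hypothetical, and grep -c is used only to count the matching lines.

```shell
# Hypothetical sample source lines
src='#include <stdio.h>
  #  include "local.h"
// #include <commented.h>
#include MACRO_NAME
int main(void) { return 0; }'

# Strict form: requires " or < after include, so the macro-name variant is skipped.
strict=$(printf '%s\n' "$src" | grep -c '^[[:space:]]*#[[:space:]]*include[[:space:]]*["<]')

# Relaxed form: stops after the word include, so the macro-name variant is counted too.
relaxed=$(printf '%s\n' "$src" | grep -c '^[[:space:]]*#[[:space:]]*include')

printf 'strict=%s relaxed=%s\n' "$strict" "$relaxed"
```

Neither form counts the line that starts with //, because only white space may come before the #.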
To find lines that do not start with a double slash, use -v (to invert the match) and '^//' to look for slashes at the start of a line:
grep -v '^//'
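You have to use the -P (perl) option, and the lookahead has to be anchored at the start of the line:
grep -P '^(?!//)' bar.txt
For the lines not beginning with "//", you could use (^[^/]{2}.*$) with grep -E, although this is only an approximation: it also rejects any line with a slash in either of its first two characters, and any line shorter than two characters.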
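A regex-only version that needs neither -P nor -v can be sketched in extended syntax by listing the ways a line can start without being "//". The sample lines are hypothetical, and the grep -v result is computed only for comparison.

```shell
# Hypothetical sample lines
input='hello
//world
/single slash
  // indented'

# A line does not begin with // if its first character is not a slash,
# or it is a slash followed by a non-slash, or the line is empty or a lone slash.
regex_only=$(printf '%s\n' "$input" | grep -E '^([^/]|/[^/]|/?$)')

# Reference result using inversion
inverted=$(printf '%s\n' "$input" | grep -v '^//')

printf '%s\n' "$regex_only"
```

Both commands should select the same three lines here; the indented comment is kept because it begins with spaces rather than slashes.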
If you don't like grep -v for this then you could just use awk:
awk '!/^\/\//' file
Since awk supports compound conditions instead of just regexps, it's often easier to specify what you want to match with awk than grep, e.g. to search for a and b in any order with grep:
grep -E 'a.*b|b.*a'
while with awk:
awk '/a/ && /b/'
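The two commands can be compared on a small sample. The input lines are hypothetical and chosen so that only the first two contain both letters.

```shell
# Hypothetical sample: only the first two lines contain both a and b
sample='xaxbx
xbxax
xax
xbx
xyz'

# grep has to spell out both orders explicitly
grep_hits=$(printf '%s\n' "$sample" | grep -E 'a.*b|b.*a')

# awk simply combines two independent conditions
awk_hits=$(printf '%s\n' "$sample" | awk '/a/ && /b/')

printf 'grep:\n%s\nawk:\n%s\n' "$grep_hits" "$awk_hits"
```

With three or more required terms the grep alternation grows with every possible ordering, while the awk condition just gains one more && clause.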
I have a file path that I need to parse, however, I am pretty new to shell and am not really sure what the appropriate or conventional process would be. Let's say I have a variable representing a file path:
EX=/home/directory/this/might/have/numbers23/12349_2348/more/paths
I need to obtain 12349_2348, which will then become a file name for some other things that I already know how to do. How can I extract this? I know a basic way to do this using regex, which would match with /([0-9])\d+/, however, I determined that by playing around with regexr and have no idea what to do with it from there. I have tried using sed as follows:
echo $EX | sed /([0-9])\d+/
but this does not do anything and just gives me an error. What is a better way to do this, and if sed is the best way to do it, what am I doing wrong? I have looked at tutorials and it seems like I should be able to just match the regular expression this way.
It depends on how you know what you're looking for. So for example if you know it's some digits followed by underscore followed by some more digits, you could do this:
dwalker$ EX=/home/directory/this/might/have/numbers23/12349_2348/more/paths
dwalker$ echo "$EX" | grep -Eo '[0-9]+_[0-9]+'
12349_2348
5 digits followed by underscore followed by 4 digits:
dwalker$ EX=/home/directory/this/might/have/numbers23/12349_2348/more/paths
dwalker$ echo "$EX" | grep -Eo '[0-9]{5}_[0-9]{4}'
12349_2348
If you know you need to take 2 subdirectories off the end, and then what remains is your directory, you can do this:
$ EX1=`dirname $EX`
$ EX1=`dirname $EX1`
$ basename $EX1
12349_2348
So there are a couple of ways to do it.
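egrep is "extended" grep (the same as grep -E). Be aware that \d is not part of POSIX extended regular expressions: some implementations accept it as a shorthand for digits, but GNU grep -E does not, so [0-9] or [[:digit:]] is the portable way to match a digit. You can see the man page for more details, and for the explanation of -o.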
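Since the question asked about sed specifically, here is a sketch that puts three approaches side by side. The variable names are made up for illustration, and the sed form assumes a sed that accepts -E for extended regexes (GNU and BSD sed both do).

```shell
EX=/home/directory/this/might/have/numbers23/12349_2348/more/paths

# 1. Match by shape with grep: digits, underscore, digits
by_grep=$(printf '%s\n' "$EX" | grep -Eo '[0-9]+_[0-9]+')

# 2. The same shape with sed: capture the component and replace the whole path with it.
#    Using # as the delimiter avoids escaping the slashes in the path.
by_sed=$(printf '%s\n' "$EX" | sed -E 's#.*/([0-9]+_[0-9]+)/.*#\1#')

# 3. Match by position: strip two trailing components, then take the last one
by_position=$(basename "$(dirname "$(dirname "$EX")")")

printf '%s\n%s\n%s\n' "$by_grep" "$by_sed" "$by_position"
```

The original sed attempt failed because sed expects a command such as s/.../.../ rather than a bare pattern, and because the unquoted parentheses were interpreted by the shell.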
I have a file like this:
blue|1|red|2
green|3|blue|4
darkblue|0|yellow|3
I want to use grep to find anything containing blue| at the beginning of line or |blue| anywhere, but not any darkblue| or |darkblue| or |blueberry|
I tried to use grep [^|\|]blue\| but Git Bash gives me an error:
$ grep [^|\|]blue\| *.*
grep: Unmatched [ or [^
sh.exe": |]blue|: command not found
What did I do wrong? What's the proper way to do it?
Here's a quick & dirty one:
grep -E '(^|\|)blue\|' *
Matches start of line or |, followed by blue|. The important note is that you need extended regular expressions (via egrep or the -E flag) to use the | (or) construct.
Also, note the single quotes around the regular expression.
So, in answer to the OP's "What did I do wrong?",
You forgot to put the regexp in single quotes;
You chose the wrong type of brackets to enclose the alternate expressions; and finally
You forgot to use egrep or the -E flag
It's always easier to see other people's errors; I wish I was as quick to spot my own :-|
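The corrected command can be checked against the question's data. In this sketch the fourth input line is a hypothetical addition that covers the blueberry case the asker wanted to exclude.

```shell
# The question's three lines plus a hypothetical blueberry line
data='blue|1|red|2
green|3|blue|4
darkblue|0|yellow|3
red|1|blueberry|2'

# (^|\|) means "start of line or a literal pipe"; blue\| then requires a pipe right after blue.
hits=$(printf '%s\n' "$data" | grep -E '(^|\|)blue\|')

printf '%s\n' "$hits"
```

Only the first two lines should be printed. Note that a trailing field such as red|blue with no final pipe would not match this pattern.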
I'm trying to play around w/ a negative lookbehind regex, but I can't seem to get it to work in my zshell. Am I doing this wrong?
echo "Nate or nate" | grep "(\?<!N)a"
This should match the a in nate but NOT the a in Nate...right?
When I think of lookahead or lookbehind assertions, I think of Perl. You will need to use perl-regexp and single quotes to find the a in nate:
echo "Nate or nate" | grep -P '(?<!N)a'
It should. However, grep will print out any line with a match.
If you'd like grep to print out only the parts of the line it matches, you should give it the -o option.
There are a number of different regex flavours, but the regex for grep should probably look like this: "(?<!N)a".
First off, you want to use single quotes (double quotes in zsh will try to expand the !N). Lookbehind is not part of basic or extended regexes, so grep -E will not help here; you need a grep with Perl-compatible regex support (grep -P). Depending on your version of grep, it may not support zero-width assertions at all; check your man 7 re_format.
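Where -P is not available, a lookbehind can be approximated in extended syntax by matching the preceding character explicitly, with the caveat that this character becomes part of the match. A sketch with hypothetical variable names:

```shell
text='Nate or nate'

# Portable approximation: "start of line or any character other than N", then a.
# The preceding character is consumed, so -o prints it along with the a.
ere_hits=$(printf '%s\n' "$text" | grep -Eo '(^|[^N])a')

# True lookbehind, attempted only if this grep was built with PCRE support
pcre_hits=''
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
    pcre_hits=$(printf '%s\n' "$text" | grep -oP '(?<!N)a')
fi

printf 'ERE: %s\nPCRE: %s\n' "$ere_hits" "$pcre_hits"
```

The ERE form reports na (the a of nate together with the n before it), while the PCRE form reports the a alone, because a lookbehind is zero-width.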