How to use alternation operator in Vim + RipGrep?

I have the following in my .vimrc which (I believe) makes :grep within Vim use rg:
if executable('rg')
set grepprg=rg\ --no-heading\ --vimgrep\ --hidden\ --case-sensitive\ --ignore-vcs\ --glob\ '!.git'\ --glob\ '!node_modules'
endif
I want to search for all definitions of functions named render.... If I do
rg -e \(const\|let\)\ render .
on the command line, I get what I'm looking for.
But
:grep -e \(const\|let\)\ render
in vim results in
zsh:1: command not found: let) render
regex parse error:
(const
^
error: unclosed group
I've tried some other combos of \, putting the whole query in /.../, can't quite get it working.
How do I use the alternation operator in ripgrep in vim?

There are three pieces of machinery involved, here, each with its own idiosyncrasies: Vim, your shell, and RipGrep.
Ideally, this is how your pattern should look with RipGrep's syntax:
(let|const) render
If you try it as-is:
:grep (let|const) render
you should get a cascade of messages (irrelevant lines removed):
:!rg (let 2>&1| tee /var/folders/q4/8ckdmdb136z10l1nh7ss_hsw0000gn/T/vphU3gH/26
/opt/local/bin/bash: -c: line 1: syntax error near unexpected token `let'
/opt/local/bin/bash: -c: line 1: `rg (let 2>&1| tee /var/folders/q4/8ckdmdb136z10l1nh7ss_hsw0000gn/T/vphU3gH/26'
shell returned 2
E40: Can't open errorfile
/var/folders/q4/8ckdmdb136z10l1nh7ss_hsw0000gn/T/vphU3gH/26
Vim
The first line:
:!rg (let 2>&1| tee /var/folders/q4/8ckdmdb136z10l1nh7ss_hsw0000gn/T/vphU3gH/26
^^^^^^^
tells you that the command executed under the hood is:
rg (let
which is obviously incomplete. That is because Vim thinks that the | is a command separator (:help :bar) so it tries to execute the broken :grep (let. If you want your | to pass through, you must escape it:
:grep (let\|const) render
OK, all the arguments are now passed to rg:
:!rg (let|const) render 2>&1| tee /var/folders/q4/8ckdmdb136z10l1nh7ss_hsw0000gn/T/vphU3gH/27
^^^^^^^^^^^^^^^^^^^^^
Your shell
You are not done yet, though:
/opt/local/bin/bash: -c: line 1: syntax error near unexpected token `let'
/opt/local/bin/bash: -c: line 1: `rg (let|const) render 2>&1| tee /var/folders/q4/8ckdmdb136z10l1nh7ss_hsw0000gn/T/vphU3gH/27'
shell returned 2
E40: Can't open errorfile /var/folders/q4/8ckdmdb136z10l1nh7ss_hsw0000gn/T/vphU3gH/27
Your pattern includes a capture group delimited with parentheses, which confuses the hell out of your shell because it looks like an attempt to execute the command let|const in a subshell, which is bound to fail anyway, but in a context where it can't be done.
You can try to solve those problems by escaping things with backslashes but you are entering an escaping arms race between the shell and Vim. That is the kind of race where there is no winner.
A better approach is to wrap your whole pattern with single quotes:
:grep '(let\|const) render'
which tells your shell to treat what is between the quotes literally, without trying to be smart.
You can check what arguments are passed to rg by forcing an error:
:grep '(let\|const) render' foo
which should show you this:
:!rg '(let|const) render' foo 2>&1| tee /var/folders/q4/8ckdmdb136z10l1nh7ss_hsw0000gn/T/vphU3gH/29
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Well done!
RipGrep
Without the single quotes, RipGrep wouldn't know that render is part of the pattern so it treats it as a filename and you get errors because that filename doesn't exist.
Wrapping the pattern in single quotes killed two birds with one stone: your shell expansion issue is solved and RipGrep knows where your pattern ends.
NOTE: While it is inconsequential, here, the -e flag is not necessary because your pattern doesn't start with a -.
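The effect of the quoting can be checked outside Vim with a small shell sketch. show_args is a hypothetical helper (not part of Vim or RipGrep) that prints each argument it receives, and grep -E is used here only as a stand-in for rg, assuming the same alternation syntax; the sample lines are made up.

```shell
# show_args is a hypothetical helper: it prints each argument it receives
# on its own line, wrapped in brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# Single-quoted: the shell passes the whole pattern through as one argument.
quoted=$(show_args '(let|const) render')

# Backslash-escaped, in the style of the question's command line: same result.
escaped=$(show_args \(let\|const\)\ render)

echo "$quoted"    # → [(let|const) render]
echo "$escaped"   # → [(let|const) render]

# Made-up sample input; grep -E stands in for rg here.
sample='const renderHeader = 1
let renderFooter = 2
var render = 3'
matches=$(printf '%s\n' "$sample" | grep -cE '(let|const) render')
echo "$matches"   # → 2
```

Inside Vim the only extra step is the \| needed to get the bar past Vim itself; the shell and the search tool then see exactly what is shown above.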

Related

Why am I getting the error: Unmatched ( in regex; marked by <-- HERE?

I am having trouble figuring out why am I getting the error defined in the title.
This is the line of code I'm inputting into the command line:
perl -pi -e 's/(\/(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)' myfilepath
Basically, what I'm trying to do is go through a body of text, find all the URLS and append something to the end of the domain. For example:
https://thisisalink.com/navigate/page <-- I want to ignore the ]
I keep getting this error when I run that code though:
Unmatched [ in regex; marked by <-- HERE in *)|[ <-- HERE A-Z0-9+&##%=~_|5.030003)/gxi)/ at -e line 1, <> line 1.
How to fix this issue?
$] is a special variable that contains the current version of the Perl interpreter used. Hence, [A-Z0 9+&##%=~_|$] is interpolated as [A-Z0 9+&##%=~_|5.032001 (on my Perl 5.32.1), and the opening [ is thus unmatched. To fix this, escape the $ using \$:
[A-Z0 9+&##%=~_|\$]
Similarly, earlier in the regex, you are using [...$?...], except that $? is also a special variable, containing "the status returned by the last pipe close, backtick (``) command, successful call to wait() or waitpid(), or from the system() operator". This does not cause an error, since the value is an integer, but the class will not match either $ or ? as you'd like. Once again, escape the $: \$?.
In general, when you want to match a literal $, you should probably escape it.

Multiple vim regex command in one line

I want to run multiple regex commands in a pipe in gvim.
for example:
s/,/;
s/\v\[(\d+):0\]/\=submatch(1)+1/g
How can I implement it in one line? Does gvim support two regex commands in pipe?
I tried to run:
s/,/; | s/\v\[(\d+):0\]/\=submatch(1)+1/g
however it doesn't work for me.
hope for help
thanks :)
does gvim support two regex commands in pipe?
"Bars" are not about "regexes". They are about individual commands (see :h :bar for a complete list; also you may want to read :h cmdline-lines in full). But it actually works for :s, as per Vim's help: "Note that this is confusing (inherited from Vi): With ":g" the '|' is included in the command, with ":s" it is not."
however it doesn't work for me
That's because you must close the first regex before starting the second command: :s/,/;/ | ...
But in general, if you need to have "a bar" after a command which forcefully treats it as an argument, you can quote it with :h :execute, like this: execute 'cmd1' | cmd2. Beware of extra quoting single-quotes though.
To run multiple commands on one line, separate them with a bar:
:command1 | :command2 | :command3 | and more...
for example
:retab | :%s/match/replace/g | :let tmp = 'something text' | :echo tmp
The colon before each command is optional; :retab | %s/match/replace/g works the same way.

Finding strings across lines and replace with nothing

I have some 'fastq' format DNA sequence files (basically just text files) like this:
#Sample_1
ACTGACTGACTGACTGACTGACTGACTG
ACTGACTGACTGACTGACTGACTGACTG
+
BBBBBBBBBBBBEEEEEEEEEEEEEEEE
EHHHHKKKKKKKKKKKKKKNQQTTTTTT
#
+
#
+
#Sample_4
ACTGACTGACTGACTGACTGACTGACTG
ACTGACTGACTGACTGACTGACTGACTG
+
BBBBBBBBBBBBEEEEEEEEEEEEEEEE
EHHHHKKKKKKKKKKKKKKNQQTTTTTT
My ultimate goal is to turn these into 'fasta' format files, but to do that I need to get rid of the two empty sequences in the middle.
EDIT
The desired output would look like this:
#Sample_1
ACTGACTGACTGACTGACTGACTGACTG
ACTGACTGACTGACTGACTGACTGACTG
+
BBBBBBBBBBBBEEEEEEEEEEEEEEEE
EHHHHKKKKKKKKKKKKKKNQQTTTTTT
#Sample_4
ACTGACTGACTGACTGACTGACTGACTG
ACTGACTGACTGACTGACTGACTGACTG
+
BBBBBBBBBBBBEEEEEEEEEEEEEEEE
EHHHHKKKKKKKKKKKKKKNQQTTTTTT
All of the dedicated software I tried (Biopython, stand alone programs, perl scripts posted by others) crash at the empty sequences. This is really just a problem of searching for the string #\n+ and replacing it with nothing. I googled this and read several posts and tried about a million options with sed and couldn't figure it out. Here are some things that didn't work:
sed s/'#'/,/'+'// test.fastq > test.fasta
sed s/'#,+'// test.fastq > test.fasta
Any insights would be greatly appreciated.
PS. I've got a Mac.
Try:
sed "/^[#+]*$/d" test.fastq > test.fasta
The d command tells sed to "delete" the matching line (i.e. not print it).
^ and $ mean "start of line" and "end of line" respectively, i.e. the whole line must match.
So, the above command basically says:
Print all lines except those made up only of # and + characters (empty lines are dropped too), and write the result to test.fasta.
Edit: I misunderstood the question slightly, sorry. If you want to only remove pairs of consecutive lines like
#
+
then you need to perform a multi-line search and replace.
Although this can be done with sed, it's perhaps easier to use something like a perl script instead:
perl -0pe 's/^#\n\+\n//gm' test.fastq > test.fasta
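The -0 option sets Perl's input record separator to the NUL character. A text file contains no NULs, so Perl effectively reads the entire input in one shot (instead of line by line), which is what enables multi-line search and replace. The explicit "file slurp" switch is -0777.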
The -pe option allows you to run Perl code (pattern matching and replacement in this case) and display output from the command line.
^#\n\+\n is the pattern to match, which we are replacing with nothing (i.e. deleting).
/gm makes the substitution multiline and global.
You could also pass -i as the first parameter to perl, to edit the file in place.
This may not be the most elegant solution in the world, but you can use tr to replace the \n with a null character and back.
cat test.fastq | tr '\n' '\0' | sed 's/#\x0+\x0//g' | tr '\0' '\n' > test.fasta
Try this:
sed '/^#$/{N;/\n+$/d}' file
When # is found, next line is appended to the pattern space with N.
If the appended line is exactly + (the /\n+$/ test), the d command deletes both lines.
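Here is that command run on a shortened, made-up version of the sample from the question, as a quick sanity check. The only change is a ; before the closing }, which BSD sed (the one on a Mac) expects and GNU sed accepts too.

```shell
# Made-up, shortened sample: two real records around two empty ones.
input='#Sample_1
ACTGACTG
+
BBBBEEEE
#
+
#
+
#Sample_4
ACTGACTG
+
EHHHHKKK'

# On a line that is exactly "#", append the next line (N);
# if the appended line is exactly "+", delete the pair (d).
cleaned=$(printf '%s\n' "$input" | sed '/^#$/{N;/\n+$/d;}')

printf '%s\n' "$cleaned"
```

The lone + separators inside the real records survive, because the deletion only fires when the line before them is exactly #.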

I need to use sed to comment out two lines in a text file

I am running a custom kernel build and have created a custom config file in a bash script, now I need to comment out two lines in Kbuild in order to prevent the bc compiler from running. The lines are...
$(obj)/$(timeconst-file): kernel/time/timeconst.bc FORCE
$(call filechk,gentimeconst)
Using Expresso, I have a regex that matches the first line...
^\$\(obj\)\/\$\(timeconst-file\): kernel\/time\/timeconst\.bc FORCE
But can't get sed to actually insert a # in front of the line.
Any help would be much appreciated.
sed -i "/<Something that matches the lines to be replaced>/s/^#*/#/g"
This uses a regex address (/<something>/) to select the lines you want to comment, then the substitution (s/^#*/#/) replaces the start of the line, plus any #s already there, with a single #. So you can run it on lines that are already commented, no problem. The g flag is harmless but not needed, since ^ can only match once per line; sed already applies the command to every line the address selects, so mass commenting works either way.
I have a bash script that I can mass comment using the above as:
sed -i.bkp "/$1/s/^#*/#/" "$2"
-i.bkp edits the file in place and keeps a backup of the original with a .bkp suffix
Script is called ./comment.sh <match> <filename>
The match does not have to match the entire line, just enough to make it only hit lines you want.
You can use following sed for replacement:
sed 's,^\($(obj)/$(timeconst-file): kernel/time/timeconst.bc FORCE\),#\1,'
You don't need to escape ( ) or $ here: in sed without -r (basic regular expressions) they are treated as literals, and grouping is written \( \) instead.
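A minimal sketch that comments out both lines in one pass, writing to stdout rather than editing in place. The third line of the sample is a made-up unrelated line, included only to show that it is left alone; the two address patterns are my own choice, picked to be specific enough not to hit it.

```shell
# Sample input: the two lines from the question plus a made-up unrelated line.
kbuild='$(obj)/$(timeconst-file): kernel/time/timeconst.bc FORCE
$(call filechk,gentimeconst)
timeconst-file := include/generated/timeconst.h'

# Each -e selects one target line by a distinctive fragment, then replaces
# the start of the line (plus any existing #s) with a single #.
commented=$(printf '%s\n' "$kbuild" | sed \
  -e '/timeconst\.bc FORCE$/s/^#*/#/' \
  -e '/filechk,gentimeconst/s/^#*/#/')

printf '%s\n' "$commented"
```

Because of the ^#* part, running the same command over its own output does not stack up extra # characters.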

sed randomized last digits using expression

I need to parse a file and randomize the last digits of a given string when the pattern is found.
I am able to perform the desired result when using a simple case but it fails for a more complex case.
I am wondering what is wrong with the second case.
This example here works.
echo 'AB111-1-13' | sed 's/\(AB111\)-\([0-9]*\)-\([0-9]*\)/echo \1-\2-$(echo \3*$RANDOM | bc )/ge'
But this one doesn't work.
echo '<http://name/link#AB111-1-13>' | sed 's/\(AB111\)-\([0-9]*\)-\([0-9]*\)/echo \1-\2-$(echo \3*$RANDOM | bc )/ge'
Any ideas?
EDIT
This is the error message when trying to run the second example.
sh: -c: line 0: syntax error near unexpected token `newline'
sh: -c: line 0:'
The GNU sed e flag executes the pattern space as a shell command.
In your first example your pattern space starts as AB111-1-13 and becomes echo AB111-1-$(echo 13*$RANDOM | bc ), which is a valid shell command and gets executed. (I should point out that bc is entirely unnecessary here, as the shell can perform integer arithmetic just fine by itself: echo $((13 * RANDOM)).)
But in your second example your pattern space starts as <http://name/link#AB111-1-13> and becomes <http://name/link#echo AB111-1-$(echo 13*$RANDOM | bc )>, which is very much not a valid shell command and so, presumably, you get a shell error (would have been good of you to include it in the question though) when it tries to get executed.
So don't use sed for this. Use something that can evaluate arbitrary expressions like awk or perl or python, etc.
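As a rough sketch of the awk route: randomize_suffix is a hypothetical function name, and multiplying by int(rand() * 32768) is my assumption for mimicking the 0..32767 range of $RANDOM. It only rewrites the first AB111-N-N match on each line.

```shell
# randomize_suffix is a hypothetical helper: it multiplies the last number of
# the first AB111-N-N match on each line by a random integer in 0..32767.
randomize_suffix() {
  awk 'BEGIN { srand() }   # seed from the current time
  {
    if (match($0, /AB111-[0-9]+-[0-9]+/)) {
      id = substr($0, RSTART, RLENGTH)        # e.g. AB111-1-13
      split(id, parts, "-")
      parts[3] = parts[3] * int(rand() * 32768)
      $0 = substr($0, 1, RSTART - 1) parts[1] "-" parts[2] "-" parts[3] substr($0, RSTART + RLENGTH)
    }
    print
  }'
}

result=$(echo '<http://name/link#AB111-1-13>' | randomize_suffix)
echo "$result"
```

Nothing here is handed to a shell for execution, so the surrounding < > and # characters are just text and cause no trouble.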