Grep across the file system has no output in a shell script - regex

I'm trying to create a pre-commit hook in Git that will check for any debugging code and prompt the user to fix it. I have a regex that I'm grepping for (ignore the fact that it won't exclude occurrences in multiline comments!):
grep -IiRn --exclude-dir={node_modules,vendor,public,lib,contrib} --include=\*.{module,inc,install,php,js} -P '^\s*(?!\/\/)\s*(dpm\(|dsm\(|console.log\()' /path/to/code/
This works fine when I run it normally in the console, but when I try it in an executable .sh script it does nothing. None of the following has worked for me:
#!/bin/sh
grep ...
MYVAR =`grep ...` # Note the backticks!
echo $MYVAR
MYVAR =$(grep ...)
echo $MYVAR
MYVAR ="`grep ...`"
echo $MYVAR
I tried doing it with Python and os.system() but that did nothing either. It seems to just have no STDOUT. There's possibly something obvious I'm missing but I'm at a loose end.
Any help would be much appreciated! Thanks.
Edit:
This is the exact script, even though it's at the earliest possible stage due to not being able to actually do the first bit. I've hidden the exact folder names because it's probably best to not share my company's code base on SO ;)
#!/bin/bash
echo "Test!"
ONE=`grep -IiRn --exclude-dir={node_modules,vendor,public,lib,contrib} --include=\*.{module,inc,install,php,js} -P '^\s*(?!\/\/)\s*(dpm\(|dsm\(|console.log\()' /company/projects/company/www/sites/all/modules/custom/`
TWO=$(grep -IiRn --exclude-dir={node_modules,vendor,public,lib,contrib} --include=\*.{coffee} -P '^\s*(?!\#)\s*(dpm\(|dsm\(|console.log)' /company/projects/company/www/sites/all/modules/custom/)
echo $ONE
echo "$TWO"
... and running bash -x pre-commit returns:
ubuntu@ip-12-34-56-78:/company/projects/company/scripts$ bash -x pre-commit
+ echo 'Test!'
Test!
++ grep -IiRn --exclude-dir=node_modules --exclude-dir=vendor --exclude-dir=public --exclude-dir=lib --exclude-dir=contrib '--include=*.module' '--include=*.inc' '--include=*.install' '--include=*.php' '--include=*.js' -P '^\s*(?!\/\/)\s*(dpm\(|dsm\(|console.log\()' /company/projects/company/www/sites/all/modules/custom/
+ ONE='/company/projects/company/www/sites/all/modules/custom/some_module/some_module.report.inc:594: dsm('\''test'\'');
/company/projects/company/www/sites/all/modules/custom/goals_app/goals_app.module:170: console.log(e.stack);
/company/projects/company/www/sites/all/modules/custom/company_usage_reports/js/script.js:300: console.log('\''fetch success'\'');
/company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_change_workgroup.js:19: console.log('\''wtf?'\'');
/company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_reorder_table.js:33: console.log(resp);
/company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_reorder_table.js:39: console.log(ui.placeholder);
/company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_goal_form.js:4: console.log($( ".required" ));
/company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_reorder.js:40: console.log(resp);
/company/projects/company/www/sites/all/modules/custom/company_goals/js/views/goal-list.js:87: console.log(data);'
++ grep -IiRn --exclude-dir=node_modules --exclude-dir=vendor --exclude-dir=public --exclude-dir=lib --exclude-dir=contrib '--include=*.{coffee}' -P '^\s*(?!\#)\s*(dpm\(|dsm\(|console.log)' /company/projects/company/www/sites/all/modules/custom/
+ TWO=
+ echo /company/projects/company/www/sites/all/modules/custom/some_module/some_module.report.inc:594: 'dsm('\''test'\'');' /company/projects/company/www/sites/all/modules/custom/goals_app/goals_app.module:170: 'console.log(e.stack);' /company/projects/company/www/sites/all/modules/custom/company_usage_reports/js/script.js:300: 'console.log('\''fetch' 'success'\'');' /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_change_workgroup.js:19: 'console.log('\''wtf?'\'');' /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_reorder_table.js:33: 'console.log(resp);' /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_reorder_table.js:39: 'console.log(ui.placeholder);' /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_goal_form.js:4: 'console.log($(' '".required"' '));' /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_reorder.js:40: 'console.log(resp);' /company/projects/company/www/sites/all/modules/custom/company_goals/js/views/goal-list.js:87: 'console.log(data);'
/company/projects/company/www/sites/all/modules/custom/some_module/some_module.report.inc:594: dsm('test'); /company/projects/company/www/sites/all/modules/custom/goals_app/goals_app.module:170: console.log(e.stack); /company/projects/company/www/sites/all/modules/custom/company_usage_reports/js/script.js:300: console.log('fetch success'); /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_change_workgroup.js:19: console.log('wtf?'); /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_reorder_table.js:33: console.log(resp); /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_reorder_table.js:39: console.log(ui.placeholder); /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_goal_form.js:4: console.log($( ".required" )); /company/projects/company/www/sites/all/modules/custom/another_module/js/another_module_reorder.js:40: console.log(resp); /company/projects/company/www/sites/all/modules/custom/company_goals/js/views/goal-list.js:87: console.log(data);
+ echo ''
... but running it without the -x flag STILL doesn't work.
Edit two:
In case anyone is wondering, my env is as follows...
ubuntu@ip-12-34-56-78:~$ uname -a
Linux ip-12-34-56-78 3.2.0-31-virtual #50-Ubuntu SMP Fri Sep 7 16:36:36 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@ip-12-34-56-78:~$ whereis sh && whereis bash
sh: /bin/sh /bin/sh.distrib /usr/share/man/man1/sh.1.gz
bash: /bin/bash /etc/bash.bashrc /usr/share/man/man1/bash.1.gz

I can't say for sure until you post the actual script you're running, but in your current code snippet you have
#!/bin/sh
Depending on your OS, this may be a link to /bin/bash, for example, or it may be the actual Bourne shell, which does not support brace expansion (e.g. {a,b,c}). Even if /bin/sh does point to /bin/bash on your machine, you should only use portable constructs if your shebang is #!/bin/sh (i.e. say what you mean). If you want to use brace expansion in your script, change the shebang to #!/bin/bash.
If you put
set -x
at the top of your script, it will print detailed information that can help with debugging. You can also do this by invoking the shell directly instead of modifying your script, for example
sh -x /path/to/script
or
bash -x /path/to/script
EDIT: On Ubuntu, /bin/sh is dash, the Debian Almquist shell. Like the Bourne shell, dash is fairly restrictive, and does not support brace expansion. See this page for a discussion of portability issues and dash.
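To see the difference concretely, here is a minimal sketch. The option names simply mirror the grep command from the question, and bash is invoked explicitly so the result does not depend on what /bin/sh happens to be. It also shows a spelled-out form that does not rely on brace expansion at all, which is what a #!/bin/sh script would need:

```shell
# Brace expansion is performed by bash before grep ever runs.
# The backslash keeps the * from being treated as a glob.
expanded=$(bash -c 'echo --include=\*.{php,js}')
echo "bash sees:     $expanded"

# Portable alternative for a #!/bin/sh script: write each option out in full.
portable='--include=*.php --include=*.js'
echo "portable form: $portable"

# A single-element list such as {coffee} is not expanded, even by bash.
single=$(bash -c 'echo --include=\*.{coffee}')
echo "single item:   $single"
```

The last case matches the '--include=*.{coffee}' argument visible in the bash -x trace in the question: with only one element between the braces, the text is passed to grep literally.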

Related

/bin/sh: 1: Syntax error: "(" unexpected after running Makefile [duplicate]

I often find Bash syntax very helpful, e.g. process substitution like in diff <(sort file1) <(sort file2).
Is it possible to use such Bash commands in a Makefile? I'm thinking of something like this:
file-differences:
	diff <(sort file1) <(sort file2) > $@
In my GNU Make 3.80 this will give an error since it uses the shell instead of bash to execute the commands.
From the GNU Make documentation,
5.3.2 Choosing the Shell
------------------------
The program used as the shell is taken from the variable `SHELL'. If
this variable is not set in your makefile, the program `/bin/sh' is
used as the shell.
So put SHELL := /bin/bash at the top of your makefile, and you should be good to go.
BTW: You can also do this for one target, at least for GNU Make. Each target can have its own variable assignments, like this:
all: a b
a:
	@echo "a is $$0"
b: SHELL:=/bin/bash # HERE: this is setting the shell for b only
b:
	@echo "b is $$0"
That'll print:
a is /bin/sh
b is /bin/bash
See "Target-specific Variable Values" in the documentation for more details. That line can go anywhere in the Makefile, it doesn't have to be immediately before the target.
You can call bash directly, use the -c flag:
bash -c "diff <(sort file1) <(sort file2) > $@"
Of course, you may not be able to redirect to the variable $@, but when I tried to do this, I got -bash: $@: ambiguous redirect as an error message, so you may want to look into that before you get too into this (though I'm using bash 3.2.something, so maybe yours works differently).
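If you go the bash -c route, single quotes plus positional parameters sidestep most of the quoting trouble. The sketch below is run from a plain shell rather than from make, and file1 and file2 are throwaway files created only for the demonstration; inside a Makefile recipe you would additionally have to write $$ for every literal $.

```shell
# Create two throwaway files with the same lines in a different order.
tmp=$(mktemp -d)
printf 'b\na\n' > "$tmp/file1"
printf 'a\nb\n' > "$tmp/file2"

# Single quotes stop the calling shell from interpreting <( ... ) itself;
# the file names arrive in bash as $1 and $2 (the _ fills $0).
status=0
bash -c 'diff <(sort "$1") <(sort "$2")' _ "$tmp/file1" "$tmp/file2" > "$tmp/result" || status=$?

echo "diff exit status: $status"
```

Because both files contain the same lines once sorted, diff prints nothing and exits with status 0.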
One way that also works is putting it this way in the first line of your target:
your-target: $(eval SHELL:=/bin/bash)
	@echo "here shell is $$0"
If portability is important you may not want to depend on a specific shell in your Makefile. Not all environments have bash available.
You can call bash directly within your Makefile instead of using the default shell:
bash -c "ls -al"
instead of:
ls -al
There is a way to do this without explicitly setting your SHELL variable to point to bash. This can be useful if you have many makefiles since SHELL isn't inherited by subsequent makefiles or taken from the environment. You also need to be sure that anyone who compiles your code configures their system this way.
If you run sudo dpkg-reconfigure dash and answer 'no' to the prompt, your system will not use dash as the default shell. It will then point to bash (at least in Ubuntu). Note that using dash as your system shell is a bit more efficient though.
This isn't a direct answer to the question, but makeit is a limited Makefile replacement with bash syntax that can be useful in some cases (I'm the author):
- rules can be defined as bash functions
- auto-completion feature
The basic idea is to have a while loop at the end of the script:
while [ $# != 0 ]; do
if [ "$(type -t $1)" == 'function' ]; then
$1
else
exit 1
fi
shift
done
https://asciinema.org/a/435159
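As a rough illustration of that dispatch loop (not the actual makeit source), the sketch below writes a tiny task script and runs it with bash. The file name tasks.sh and the build and clean rules are made up for the example.

```shell
# Write a tiny task script (hypothetical name: tasks.sh) and run it with bash.
tmp=$(mktemp -d)
cat > "$tmp/tasks.sh" <<'EOF'
#!/usr/bin/env bash
# Each "rule" is an ordinary bash function.
build() { echo "building"; }
clean() { echo "cleaning"; }

# Dispatch loop: run every argument that names a function, stop on anything else.
while [ $# != 0 ]; do
  if [ "$(type -t "$1")" = 'function' ]; then
    "$1"
  else
    echo "unknown rule: $1" >&2
    exit 1
  fi
  shift
done
EOF

# Rules run in the order given on the command line.
out=$(bash "$tmp/tasks.sh" clean build)
echo "$out"

# An argument that is not a function makes the script exit with status 1.
bad_status=0
bash "$tmp/tasks.sh" nosuchrule 2>/dev/null || bad_status=$?
echo "unknown rule exit status: $bad_status"
```

Quoting "$1" inside the loop is a small hardening over the snippet above; it keeps a rule name containing spaces or glob characters from being split or expanded.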

What shell does std::system use?

TL;DR: I guess the shell that std::system uses is sh, but I'm not sure.
I tried to print the shell with std::system("echo $SHELL"), and the output was /bin/bash, which seemed odd. So I tried the same thing from sh, and got the same output: /bin/bash. Also, if I set the SHELL variable to another string with something like SHELL="/usr/bin/something", it prints the new string (/usr/bin/something), so this doesn't look like a good way to see which shell is in use. Then I checked with the ps command, and the output was: bash, a.out, ps. It was odd to see bash in this list, so I wrote a custom shell and changed the shell in gnome-terminal to it:
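#include <cstdlib>
#include <iostream>
#include <string>
int main()
{
    std::string command;
    // Read one line at a time and hand it to the system command processor;
    // stop at end of input instead of looping forever.
    while (std::getline(std::cin, command))
    {
        std::system(command.c_str());
    }
}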
Now it's easier to test, and I think the results are more reliable.
Then I ran the ps command again, this time in the custom shell, and the results were: test_shell, ps.
That was odd again: how can the shell be neither sh nor bash? The final test I did was echo $0, and the result was sh, in both the custom shell and the normal program.
Edit
It seems that /bin/sh is linked to /bin/bash (the output of ll /bin/sh is /bin/sh -> bash), and the only difference between sh and bash is the filename; the files' contents are the same. I also compared the two files with diff:
$ xxd /bin/sh > sh
$ xxd /bin/bash > bash
$ diff sh bash
(And yes, $SHELL doesn't indicate the running shell. I didn't know that when I was testing; I just wanted to see what would happen.)
The GNU sources (https://github.com/lattera/glibc/blob/master/sysdeps/posix/system.c) say
/bin/sh
So, whatever /bin/sh points to is the shell invoked by std::system() on Linux.
(This is correct, as /bin/sh is expected to be linked to a sane shell capable of doing things with the system.)
According to cppreference.com, std::system
calls the host environment's command processor (e.g. /bin/sh, cmd.exe, command.com)
This means the shell used will depend on the operating system.
On any POSIX OS (including Linux), the shell used by std::system is /bin/sh. (Though as the OP points out, /bin/sh could be a symlink to another shell.)
As for the SHELL environment variable, as has been pointed out in the comments, this environment variable cannot be used to reliably identify the running shell program. SHELL is defined by POSIX to
represent a pathname of the user's preferred command language interpreter
(source)
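Both points can be checked from a terminal without writing any C++. This is a minimal sketch that assumes a POSIX system where std::system(cmd) behaves like running sh -c cmd; the /usr/bin/something value is the same placeholder used in the question.

```shell
# $0 names the shell that is actually interpreting the command line.
invoked_as=$(sh -c 'echo "$0"')
echo "sh -c reports \$0 as: $invoked_as"

# SHELL is only an inherited environment variable, so it can say anything.
fake=$(SHELL=/usr/bin/something sh -c 'echo "$SHELL"')
echo "SHELL inside the child: $fake"

# Show what /bin/sh really is on this machine (bash, dash, ...).
ls -l /bin/sh
```

The first command reports sh regardless of which binary sits behind /bin/sh, which is why the ls -l line is the one that tells you the actual implementation.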

How to pass a command which contains special characters through SSH?

I would like to run the following command from Jenkins:
ssh -i ~/.ssh/company.pem -o StrictHostKeyChecking=no user@$hostname "supervisorctl start company-$app ; awk -v app=$app '$0 ~ "program:company-"app {p=NR} p && NR==p+6 && /^autostart/ {$0="autostart=true" ; p=0} 1' /etc/supervisord.conf > $$.tmp && sudo mv $$.tmp /etc/supervisord.conf"
This is one of the last steps of a job which creates a CloudFormation stack.
Running the command from the target server's terminal works properly.
In this step, I'd like to ssh to each one of the servers (members of ASG's within the new stack) and search and replace a specific line as shown above in the /etc/supervisord.conf, basically setting one specific service to autostart.
When I run the command I get the following error:
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
I've tried escaping the double quotes but got the same error, any idea what I'm doing wrong?
You are running into this issue because of the way the shell handles nested quotes. This is a use case for a here-document (heredoc), which lets you write multi-line commands that are passed through bash without worrying about quotes. The structure is as follows:
$ ssh -t user@server.com <<'END'
command |\
command2 |\
END
The -t is important to the ssh command: it requests a pseudo-terminal so the remote shell behaves as if it were being used interactively, which avoids warnings and unexpected results.
In your specific case, you should try something like:
$ ssh -t -i ~/.ssh/company.pem -o StrictHostKeyChecking=no user@$hostname <<'END'
supervisorctl start company-$app |\
awk -v app=$app '$0 ~ "program:company-"app {p=NR} p && NR==p+6 \
&& /^autostart/ {$0="autostart=true" ; p=0} 1' \
/etc/supervisord.conf > $$.tmp && sudo mv $$.tmp /etc/supervisord.conf
END
Just a note: since I can't be sure about the output you want from this command, keep track of your own " and ' marks, and escape them in your awk command as you would at an interactive terminal. I notice the "s around program:company and am a bit confused by them. If they are part of the pattern being searched for, they will need to be escaped accordingly.
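One detail worth checking before relying on a heredoc is which side expands the variables. The sketch below uses a local sh -s as a stand-in for ssh user@host (both read the script from standard input), and app=web is a made-up value standing in for the Jenkins variable.

```shell
app=web   # hypothetical local value, standing in for the Jenkins variable

# Quoted delimiter: the text is sent verbatim, so $app is looked up by the
# receiving shell, where it is not set.
quoted=$(sh -s <<'END'
echo "company-$app"
END
)

# Unquoted delimiter: the local shell substitutes $app before sending.
# Anything meant for the receiving side would need a backslash, e.g. \$HOME.
unquoted=$(sh -s <<END
echo "company-$app"
END
)

echo "quoted:   $quoted"
echo "unquoted: $unquoted"
```

With the quoted form, a local value such as $app never reaches the other side, so it has to be passed some other way or the delimiter left unquoted with the remote-side $ signs escaped.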

Creating a negative lookahead in a pgrep/pkill command within a complicated unix command

I'm writing a daemon that will log in to other machines to confirm that a service is running and also start, stop, or kill it. Because of this, the unix commands get a little long and obfuscated.
The basic shape of the commands that are forming are like:
bash -c 'ssh -p 22 user@host.domain.com pgrep -fl "APP.*APP_id=12345"'
Where APP is the name of the remote executable and APP_id is a parameter passed to the application when started.
The executable running on the remote side will be started with something like:
/path/to/APP configs/config.xml -v APP_id=12345 APP_port=2345 APP_priority=7
The exit status of this command is used to determine if the remote service is running or was successfully started or killed.
The problem I'm having is that when testing on my local machine, ssh connects to the local machine to make things easier, but pgrep called this way will also identify the ssh command that the server is running to do the check.
For example, pgrep may return:
26308 ./APP configs/config.xml APP_id=128bb8da-9a0b-474b-a0de-528c9edfc0a5 APP_nodeType=all APP_exportPort=6500 APP_clientPriority=11
27915 ssh -p 22 user@localhost pgrep -fl APP.*APP_id=128bb8da-9a0b-474b-a0de-528c9edfc0a5
So the logical next step was to change the pgrep pattern to exclude 'ssh', but this seems impossible because pgrep does not seem to be compiled with a PCRE version that allows lookaheads, for example:
bash -c 'ssh -p 22 user@localhost pgrep -fl "\(?!ssh\).*APP.*APP_id=12345"'
This will throw a regex error, so as a workaround I was using grep:
bash -c 'ssh -p 22 user@host.domain.com pgrep -fl "APP.*APP_id=12345" \\| grep -v ssh'
This works well for querying with pgrep even though it's a workaround. However, the next step using pkill doesn't work because there's no opportunity for grep to be effective:
bash -c 'ssh -p 22 user@host.domain.com pkill -f "APP.*APP_id=12345"'
Doesn't work well because pkill also kills the ssh connection which causes the exit status to be bad. So, I'm back to modifying my pgrep/pkill pattern and not having much luck.
This environment can be simulated with something simple on a local machine that can ssh to itself without a password (in this case, APP would be 'watch'):
watch echo APP_id=12345
Here is the question simply put: How do I match 'APP' but not 'ssh user@host APP' in pgrep?
It's kind of a workaround, but does the job:
bash -c 'ssh -p 22 user@host.domain.com pgrep -fl "^[^[:space:]]*APP.*APP_id=12345"'
...which only matches commands that have no space before the application name. This isn't entirely complete, because it's possible that the path to the executable may contain a directory with spaces, but without lookaround syntax I haven't thought of another way to make this work.
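The anchored pattern can be tried out offline. The sketch below uses grep -E as a stand-in for pgrep -f, on the assumption that both apply POSIX extended regular expressions to the full command line; the two sample lines are modelled on the pgrep output in the question. It uses the POSIX class [[:space:]], because \s inside a bracket expression is not portably treated as whitespace.

```shell
# Two sample command lines modelled on the pgrep output in the question.
app_cmd='./APP configs/config.xml APP_id=12345 APP_port=2345'
ssh_cmd='ssh -p 22 user@localhost pgrep -fl APP.*APP_id=12345'

# Anchored pattern: no whitespace is allowed before APP.
pattern='^[^[:space:]]*APP.*APP_id=12345'

# Count the matching lines, then show which one matched.
matches=$(printf '%s\n' "$app_cmd" "$ssh_cmd" | grep -E -c "$pattern")
matched_line=$(printf '%s\n' "$app_cmd" "$ssh_cmd" | grep -E "$pattern")
echo "$matches line(s) matched: $matched_line"
```

Only the application's own command line matches; the ssh line is rejected because a space appears before the first APP.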
really old q but!
export VAR="agent.py"; pkill -f .*my$VAR;

Use GDB to debug a C++ program called from a shell script

I have an extremely complicated shell script, within which it calls a C++ program I want to debug via GDB. It is extremely hard to separate this C++ program from the shell script, since the script has a lot of branches and sets a lot of environment variables.
Is there a way to invoke GDB on this shell script? Looks like gdb requires me to call on a C++ program directly.
In addition to options mentioned by @diverscuba23, you could do the following:
gdb --args bash <script>
(assuming it's a bash script. Else adapt accordingly)
There are two options:
1. Invoke GDB directly within the shell script. This would imply that you don't have standard in and standard out redirected.
2. Run the shell script and then attach the debugger to the already running C++ process like so: gdb progname 1234 where 1234 is the process ID of the running C++ process.
If you need to do things before the program starts running then option 1 would be the better choice, otherwise option 2 is the cleaner way.
Modify the c++ application to print its pid and sleep 30 seconds (perhaps based on environment or an argument). Attach to the running instance with gdb.
I would probably modify the script to always call gdb (and revert this later) or add an option to call gdb. This will almost always be the easiest solution.
The next easiest would be to temporarily move your executable and replace it with a shell script that runs gdb on the moved program. For example, in the directory containing your program:
$ mv program _program
$ (echo '#!/bin/sh'; echo "exec gdb $PWD/_program") > program
$ chmod +x program
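A slightly fuller variant of the same wrapper idea, as a sketch: it works in a scratch directory with a dummy program standing in for the real binary, and it forwards the script's arguments through gdb --args, which the three-line version above does not do.

```shell
# Work in a scratch directory; "program" is a stand-in for the real binary.
dir=$(mktemp -d)
printf '#!/bin/sh\necho "real program"\n' > "$dir/program"
chmod +x "$dir/program"

# Move the real program aside and drop a wrapper in its place.
mv "$dir/program" "$dir/_program"
cat > "$dir/program" <<EOF
#!/bin/sh
# Wrapper: start the real binary under gdb, forwarding all arguments.
exec gdb --args "$dir/_program" "\$@"
EOF
chmod +x "$dir/program"

# Show the generated wrapper.
cat "$dir/program"

# To undo: mv "$dir/_program" "$dir/program"
```

The heredoc delimiter is left unquoted on purpose so that $dir is filled in when the wrapper is written, while the escaped \$@ stays literal and is expanded only when the wrapper runs.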
Could you just temporarily add gdb to your script?
Although the answers given are valid, sometimes you don't have permissions to change the script to execute gdb or to modify the program to add additional output to attach through pid.
Luckily, there is yet another way through the power of bash.
Use ps, grep and awk to pick out the pid for you after it's been executed. You can do this by either wrapping the other script with your own or by just executing a command yourself.
That command might look something like this:
process.sh
#!/usr/bin/env bash
#setup for this example
#this will execute vim (with cmdline options) as a child to bash
#we will attempt to attach to this process
vim ~/.vimrc
To get gdb to attach, we'd just need to execute the following:
gdb --pid $(ps -ef | grep -ve grep | grep vim | awk '{print $2}')
I use ps -ef here to list the processes and their arguments. Sometimes, you'll have multiple instances of a program running and need to further grep down to the one you want
the grep -ve grep is there because the f option to ps will include the next grep in its list. If you don't need the command arguments for additional filtering, don't include the -f option for ps and ignore this piece
grep vim is where we're finding our desired process. If you needed more filtering, you could just do something like grep -E "vim.*vimrc" and filter down to exactly the process that you're trying to attach to
awk '{print $2}' simply outputs just the process' pid to stdout. Use $1 if you're using ps -e instead of ps -ef
My normal setup is to run the script that starts my process in one tmux pane, with something similar to the above already typed in a bottom pane. That way, if I need to adjust the filtering (for whatever reason), I can do it pretty quickly.
Usually though, it will be the same for a specific instance and I want to just attach automatically after it's been started. I'll do the following instead:
runGdb.py
#!/usr/bin/env bash
./process.sh &
PID=$(ps -ef | grep -ve grep | grep -E "vim.*vimrc" | awk '{print $2}')
#or
#PID=$(ps -e | grep vim | awk '{print $1}')
gdb --pid $PID
This assumes that the original process can be safely run in the background.