Run multiple tools as a single bash script - C++

I am running different programs in isolation. Say there is one command line for a C++ tool and another for an R tool. First I run the command line for the C++ app, which gives me a resulting file. Only then can I run the command line for the R app, which requires the resulting file from the C++ app.
I may have many different data sets to process. Is there any way to make a bash script that loops over the different tools (C++, R, any other), so I don't have to type out many command lines manually?
I would like to go to sleep while a time-consuming loop is making noise in my computer.

Running multiple, different programs in some defined order is the fundamental idea of a (systems) scripting language like bash:
## run those three programs in sequence
first argument
second parameter
third
# same as
first argument; second parameter; third
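Since the R step depends on the output of the C++ step, && is often more useful than ; here: the chain stops as soon as one step fails:
first argument && second parameter && third
# each command runs only if the previous one succeeded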
You can do a lot of fancy things, like redirecting input and output streams:
grep secret secrets.file | grep -v strong | sort > result.file
# pipe | feeds everything from the standard output
# of the program on the left into
# the standard input of the one on the right
This also includes things like conditionals and, of course, loops:
while IFS= read -r -d '' file; do    # read -d '' consumes the NUL separators from find -print0
    preprocess "$file"
    some_work | generate "$file.output"
done < <(find ./data -type f -name 'source*' -print0)
As you can see, bash is a programming language in its own right, with a bit of a weird syntax IMHO.
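For the C++-then-R pipeline in the question, a minimal sketch could look like this (cpp_tool, analyze.R, and the ./data/*.input layout are hypothetical placeholders; substitute your real programs and paths):
#!/usr/bin/env bash
set -euo pipefail                      # abort the whole script on the first failing step

for data in ./data/*.input; do
    result="${data%.input}.result"
    ./cpp_tool "$data" "$result"       # C++ step: produces the result file
    Rscript analyze.R "$result"        # R step: consumes the result file
done
Start it before you go to sleep; running it under nohup or a terminal multiplexer like tmux keeps it alive even if you log out.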

What does the last dash in $(gcc -xc++ -E -v -) mean?

Examples:
Create an ISO image and burn it directly to a CD.
mkisofs -V Photos -r /home/vivek/photos | cdrecord -v dev=/dev/dvdrw -
Change to the previous directory.
cd -
Listen on port 12345 and untar data sent to it.
nc -l -p 12345 | tar xvzf -
What is the purpose of the dash and how do I use it?
If you mean the naked - at the end of the tar command, that's a common convention for commands that operate on files.
It allows you to specify standard input or output rather than an actual file name.
That's the case for your first and third example. For example, the cdrecord command is taking standard input (the ISO image stream produced by mkisofs) and writing it directly to /dev/dvdrw.
With the cd command, every time you change directory, the shell stores the directory you came from (in $OLDPWD). If you give cd the special - "directory name", it uses that remembered directory instead of a real one. You can switch between two directories quite quickly by using that.
Other commands may treat - as a different special value.
It's not magic. Some commands interpret - as the user wanting to read from stdin or write to stdout; there is nothing special about it to the shell.
- means exactly what each command wants it to mean. There are several common conventions, and you've seen examples of most of them in other answers, but none of them are 100% universal.
There is nothing magic about the - character as far as the shell is concerned (except that the shell itself, and some of its built-in commands like cd and echo, use it in conventional ways). Some characters, like \, ', and ", are "magical", having special meanings wherever they appear. These are "shell metacharacters". - is not like that.
To see how a given command uses -, read the documentation for that command.
It means to use the program's standard input stream.
In the case of cd, it means something different: change to the prior working directory.
The magic is in the convention. For millennia, people have used '-' to distinguish options from arguments, and have used '-' in a filename to mean either stdin or stdout, as appropriate. Do not underestimate the power of convention!
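Both conventions are easy to try at a prompt:
$ echo hello | cat -          # cat treats - as "read standard input"
hello
$ cd /tmp && cd -             # cd - switches back to (and prints) the previous directory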

How can I redirect output of a Windows C++ program to another executable?

I am experimenting with a brute force attack using C++ against a password protected rar file. I am already able to generate all of the possible 'passwords'. I am just unsure how to automate attempts to extract the archive using each of the generated combinations from my program. I'm on Windows and am trying to do this with WinRar.
One could somewhat easily do something like:
#include <iostream>
#include <string>

int main()
{
    std::string clever_password;
    for (;;) {
        /* generate the next guess into clever_password */
        std::cout << clever_password << '\0';  // note '\0', the char: a "\0" string literal prints nothing
    }
}
… and then in the shell, simply:
your-clever-password-guesser | \
sed -e 's,'\'','\''"'\''"'\'',g' | \
xargs -0 -n1 -r -t -I {} -- unrar e '-p{}' some-file.rar
Breaking that down:
Print out each password guess with a terminating '\0' character. This allows the password to (potentially) contain things like spaces and tabs that might otherwise “mess up” in the shell.
Ask the stream editor sed to protect you from apostrophes '. Each ' must be encoded as a sequence of '\'' (apos-backslash-apos-apos) or '"'"' (apos-quote-apos-quote-apos) to pass through the shell safely. The s///g pattern replaces every ' with '"'"', but the apostrophes that it itself passes to sed are written as '\''. (I mixed the styles of escaping the ' to make it easier to distinguish between the apostrophe-escaping for sed and the apostrophe-escaping which sed is adding to the stream of passwords.) One could, instead, alter the strings as they're being printed in the C++ program.
Invoke xargs to run unrar with each password, with the options that mean:
Each password is delimited by \0 (-0)
Use only one at a time (-n1)
Don't run if there isn't anything to do (-r) — e.g. if your program didn't print out any possible passwords at all.
Show the command-line as it's going to be run (-t) — this lets you monitor the guesses as they fly past on your screen
Put the password in place of the {} symbol, which is somewhat traditional for this purpose (-I {})
Then, run the command that follows --
Extract from the RAR file (unrar e …)
With the given password replacing the {} in '-p{}'; the quotes here protect against spaces and other characters that may be in the password
Then, the filename to un-RAR
If you wanted to try to run multiple unrar instances in parallel, you could also insert -P4 into the xargs invocation (e.g. …-I {} -P4 --…) to run 4 instances at a time; adjust this until your machine gets too loaded down to gain any benefit. Since this is likely disc-I/O-bound, you might want to copy the RAR file into a RAM filesystem like /tmp or /run before starting, if it's a reasonable size, so that you're not waiting on disc I/O as much. That said, the OS will likely cache the file after a few dozen rounds, so it might not actually help much over the course of a long run.
This is a brute-force way to do it, but it doesn't require as deep a knowledge of programming as, say, using fork/exec/wait to launch unrar processes, or using a RAR-enabled library to do it yourself (which would probably yield a significant speed improvement over launching the executable hundreds or thousands of times).
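If you would rather stop at the first password that works, a plain bash loop is easier to control than xargs. A sketch, assuming unrar exits with status 0 on a successful extraction (-inul silences its messages, -y answers any prompts):
./your-clever-password-guesser | while IFS= read -r -d '' pw; do
    if unrar e -inul -y "-p$pw" some-file.rar; then
        printf 'found: %s\n' "$pw"
        break                          # stop at the first success
    fi
done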
PS
I realized afterwards that perhaps you're looking for interaction with the actual WinRAR™ program. The above isn't at all helpful for that; but it will enable you to run the command-line unrar repeatedly.
Also, if you're on a Windows system, you'd need to install some of the standard shell utilities — a POSIX-compatible Bourne shell like BASH, sed, and xargs — which might imply something like Cygwin being needed. I don't have any practical experience with Windows systems to give good advice about how to do that, though.
WinRAR has an API, though it only supports decompression. Attempting to decompress the file is as simple as one function call from their API. Follow the link:
http://www.rarlab.com/rar_add.htm
Good luck!

Compounding switch regexes in Vim

I'm working on refactoring a bunch of PHP code for an instructor. The first thing I've decided to do is to update all the SQL files to be written in Drupal SQL coding conventions, i.e., to have all-uppercase keywords. I've written a few regular expressions:
:%s/create table/CREATE TABLE/gi
:%s/create database/CREATE DATABASE/gi
:%s/primary key/PRIMARY KEY/gi
:%s/auto_increment/AUTO_INCREMENT/gi
:%s/not null/NOT NULL/gi
Okay, that's a start. Now I just open every SQL file in Vim, run all five regular expressions, and save. This feels like five times the work it should be. Can they be compounded into one obnoxiously long but easily copy-pastable regex?
Why do you have to do it in Vim? How about sed/awk?
e.g. with sed
sed -e 's/create table/\U&/g' -e 's/not null/\U&/g' -e 's/.../\U&/g' *.sql
btw, in vi you may do
:%s/create table/\U&/g
to change the case, which will save some typing.
Update:
if you really want a long command to execute in vi, maybe you could try:
:%s/create table\|create database\|foo\|bar\|blah/\U&/g
Open the file containing the substitution commands.
Copy its contents (to the unnamed register, by default):
:%y
If there is only one file where the substitutions should be
performed, open it as usual and execute the contents of that register
as an Ex command:
:@"
If there are several files to edit automatically, open those
files as arguments:
:args *.sql
Execute the yanked substitutions for each file in the argument list:
:argdo @"|up
(The :update command, run after the substitutions, writes
the buffer to the file only if it has been changed.)
While sed can handle what you want (including the case-insensitive matching you requested with the 'i' flag), Vim is still much more powerful. Once I needed to change the last argument of a certain function call in a 1M SLOC code base. The arguments could be on one line or span several lines. In Vim I achieved it pretty easily.
You can open all the SQL files in Vim at once:
vim *.sql
After that run in ex mode:
:bufdo! %s/create table/CREATE TABLE/gi
Repeat for the rest of the commands. At the end, save all the files and exit Vim:
:xall
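If you would rather not open an editor at all, GNU sed can run all five substitutions over every file from bash; \U and the I (ignore-case) flag are GNU extensions, and -i.bak keeps a backup of each file in case a pattern over-matches:
for f in *.sql; do
    sed -i.bak \
        -e 's/create table/\U&/gI' \
        -e 's/create database/\U&/gI' \
        -e 's/primary key/\U&/gI' \
        -e 's/auto_increment/\U&/gI' \
        -e 's/not null/\U&/gI' \
        "$f"
done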

Apply regular expression substitution globally to many files with a script

I want to apply a certain regular expression substitution globally to about 40 Javascript files in and under a directory. I'm a vim user, but doing this by hand can be tedious and error-prone, so I'd like to automate it with a script.
I tried sed, but handling more than one line at a time is awkward, especially if there is no limit to how many lines the pattern might match.
I also tried this script (on a single file, for testing):
ex $1 <<EOF
gs/,\(\_\s*[\]})]\)/\1/
EOF
The pattern will eliminate a trailing comma in any Perl/Ruby-style list, so that "[a, b, c,]" will come out as "[a, b, c]", in order to satisfy Internet Explorer, which, alone among browsers, chokes on such lists.
The pattern works beautifully in vim but does nothing if I run it in ex, as per the above script.
Can anyone see what I might be missing?
You asked for a script, but you mentioned that you are vim user. I tend to do project-wide find and replace inside of vim, like so:
:args **/*.js | argdo %s/,\(\_\s*[\]})]\)/\1/ge | update
This is very similar to the :bufdo solution mentioned by another commenter, but it will use your args list rather than your buflist (and thus doesn't require a brand new vim session nor for you to be careful about closing buffers you don't want touched).
:args **/*.js - sets your arglist to contain all .js files in this directory and subdirectories
| - pipe is vim's command separator, letting us have multiple commands on one line
:argdo - run the following command(s) on all arguments. It will "swallow" subsequent pipes
% - a range representing the whole file
:s - substitute command, which you already know about
:s_flags, ge - global (substitute as many times per line as possible) and suppress errors (i.e. "No match")
| - this pipe is "swallowed" by the :argdo, so the following command also operates once per argument
:update - like :write but only when the buffer has been modified
This pattern will obviously work for any vim command which you want to run on multiple files, so it's a handy one to keep in mind. For example, I like to use it to remove trailing whitespace (%s/\s\+$//), set uniform line-endings (set ff=unix) or file encoding (set fileencoding=utf8), and retab my files.
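The same idea can also be driven from the shell without an interactive session. A sketch using Vim's silent Ex mode (-e -s), where -u NONE skips your vimrc and qa! quits without prompting:
vim -es -u NONE -c 'args **/*.js' -c 'argdo %s/\s\+$//ge | update' -c 'qa!'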
1) Open all the files with vim:
bash$ vim $(find . -name '*.js')
2) Apply substitute command to all files:
:bufdo %s/,\(\_\s*[\]})]\)/\1/ge
3) Save all the files and quit:
:wall
:q
I think you'll need to recheck your search pattern; it doesn't look right. I think where you have \_\s* you should have \_s* instead.
Edit: You should also use the /ge options for the :s... command (I've added these above).
You can automate the actions of both vi and ex by passing the argument +'command' from the command line, which enables them to be used as text filters.
In your situation, the following command should work fine:
find /path/to/dir -name '*.js' | xargs ex +'%s/,\(\_\s*[\]})]\)/\1/g' +'wq!'
You can use a combination of the find command and sed:
find /path -type f -iname "*.js" -exec sed -i.bak 's/,[ \t]*]/]/' "{}" +
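One caveat: sed is line-oriented, so the command above cannot remove a comma whose closing bracket sits on the next line. GNU sed's -z switch reads each whole file as a single record, and its \s class includes newlines, so the match can span lines; a sketch:
find /path -type f -iname '*.js' -exec sed -i.bak -z 's/,\(\s*[]})]\)/\1/g' "{}" +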
If you are on Windows, Notepad++ allows you to run simple regexes on all opened files.
Search for ,\s*\] and replace with ]; that should work for the type of lists you describe.

Controlling shell command line wildcard expansion in C or C++

I'm writing a program, foo, in C++. It's typically invoked on the command line like this:
foo *.txt
My main() receives the arguments in the normal way. On many systems, argv[1] is literally *.txt, and I have to call system routines to do the wildcard expansion. On Unix systems, however, the shell expands the wildcard before invoking my program, and all of the matching filenames will be in argv.
Suppose I wanted to add a switch to foo that causes it to recurse into subdirectories.
foo -a *.txt
would process all text files in the current directory and all of its subdirectories.
I don't see how this is done, since, by the time my program gets a chance to see the -a, the shell has already done the expansion and the user's *.txt input is lost. Yet there are common Unix programs that work this way. How do they do it?
In Unix land, how can I control the wildcard expansion?
(Recursing through subdirectories is just one example. Ideally, I'm trying to understand the general solution to controlling the wildcard expansion.)
Your program has no influence over the shell's command line expansion. Which program will be called is determined only after all the expansion is done, so it's already too late to change anything about the expansion programmatically.
The user calling your program, on the other hand, has the possibility to create whatever command line he likes. Shells allow you to easily prevent wildcard expansion, usually by putting the argument in single quotes:
program -a '*.txt'
If your program is called like that it will receive two parameters -a and *.txt.
On Unix, you should just leave it to the user to manually prevent wildcard expansion if it is not desired.
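You can watch the expansion happen with printf, which prints each argument it receives. In a directory containing just a.txt and b.txt:
$ printf '<%s> ' *.txt; echo       # unquoted: the shell passes one argument per file
<a.txt> <b.txt>
$ printf '<%s> ' '*.txt'; echo     # quoted: the program sees the literal pattern
<*.txt>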
As the other answers said, the shell does the wildcard expansion - and you stop it from doing so by enclosing arguments in quotes.
Note that options -R and -r are usually used to indicate recursive operation; see cp, ls, etc. for examples.
Assuming you organize things appropriately so that wildcards are passed to your program as wildcards and you want to do recursion, then POSIX provides routines to help:
nftw - file tree walk (recursive access).
fnmatch, glob, wordexp - to do filename matching and expansion
There is also ftw, which is very similar to nftw but it is marked 'obsolescent' so new code should not use it.
Adrian asked:
But I can say ls -R *.txt without single quotes and get a recursive listing. How does that work?
To adapt the question to a convenient location on my computer, let's review:
$ ls -F | grep '^m'
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2
mte/
$ ls -R1 m*
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2
mte:
multithread.ec
multithread.ec.original
multithread2.ec
$
So, I have a sub-directory 'mte' that contains three files. And I have six files with names that start 'm'.
When I type 'ls -R1 m*', the shell notes the metacharacter '*' and uses its equivalent of glob() or wordexp() to expand that into the list of names:
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2
mte
Then the shell arranges to run '/bin/ls' with 9 arguments (the program name, the option -R1, and the 7 file names), followed by the terminating null pointer.
The ls command notes the options (recursive and single-column output), and gets to work.
The first 6 names (as it happens) are simple files, so there is nothing recursive to do.
The last name is a directory, so ls prints its name and its contents, invoking its equivalent of nftw() to do the job.
At this point, it is done.
This uncontrived example doesn't show what happens when there are multiple directories, and so the description above over-simplifies the processing.
Specifically, ls processes the non-directory names first, and then processes the directory names in alphabetic order (by default), and does a depth-first scan of each directory.
foo -a '*.txt'
Part of the shell's job (on Unix) is to expand command line wildcard arguments. You prevent this with quotes.
Also, on Unix systems, the "find" command does what you want:
find . -name '*.txt'
will list all files recursively from the current directory down.
Thus, you could do
foo `find . -name '*.txt'`
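Note that the backtick form splits filenames containing whitespace into separate arguments. If that matters, let find invoke foo itself, passing the files in batches:
find . -name '*.txt' -exec foo {} +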
I wanted to point out another way to turn off wildcard expansion. You can tell your shell to stop expanding wildcards with the noglob option.
With bash use set -o noglob:
> touch a b c
> echo *
a b c
> set -o noglob
> echo *
*
And with csh, use set noglob:
> echo *
a b c
> set noglob
> echo *
*
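To turn expansion back on afterwards, use set +o noglob (or the short form set +f) in bash, and unset noglob in csh:
> set +o noglob
> echo *
a b c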