What does the last dash in $(gcc -xc++ -E -v -) mean? [duplicate]

Examples:
Create an ISO image and burn it directly to a CD.
mkisofs -V Photos -r /home/vivek/photos | cdrecord -v dev=/dev/dvdrw -
Change to the previous directory.
cd -
Listen on port 12345 and untar data sent to it.
nc -l -p 12345 | tar xvzf -
What is the purpose of the dash and how do I use it?

If you mean the naked - at the end of the tar command, that's common on many commands that want to use a file.
It allows you to specify standard input or output rather than an actual file name.
That's the case for your first and third example. For example, the cdrecord command is taking standard input (the ISO image stream produced by mkisofs) and writing it directly to /dev/dvdrw.
With the cd command, every time you change directory, the shell stores the directory you came from (in $OLDPWD). If you run cd with the special - "directory name", it changes to that remembered directory instead of a real one. That lets you toggle between two directories quite quickly.
Other commands may treat - as a different special value.
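To make the convention concrete, here is a small, self-contained illustration (using /tmp as a scratch area):

```shell
# "-" as a file name: the first tar writes the archive to stdout,
# and the second tar reads it back from stdin
mkdir -p /tmp/dashdemo && cd /tmp/dashdemo
echo hello > file.txt
tar cf - file.txt | tar tf -      # lists: file.txt

# many commands also accept "-" to mean "read from stdin"
echo hi | cat -                   # prints: hi
```

Because tar cf - writes to stdout and tar tf - reads from stdin, no intermediate archive file is ever created on disk.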

It's not magic. Some commands interpret - as the user wanting to read from stdin or write to stdout; there is nothing special about it to the shell.

- means exactly what each command wants it to mean. There are several common conventions, and you've seen examples of most of them in other answers, but none of them are 100% universal.
There is nothing magic about the - character as far as the shell is concerned (except that the shell itself, and some of its built-in commands like cd and echo, use it in conventional ways). Some characters, like \, ', and ", are "magical", having special meanings wherever they appear. These are "shell metacharacters". - is not like that.
To see how a given command uses -, read the documentation for that command.

It means to use the program's standard input stream.
In the case of cd, it means something different: change to the prior working directory.

The magic is in the convention. For decades, people have used '-' to distinguish options from arguments, and have used '-' in place of a filename to mean either stdin or stdout, as appropriate. Do not underestimate the power of convention!

Related

replace part of file name with wrong encoding

I need some guidance on how to solve this one. I have 10,000s of files in multiple subfolders where the encoding got screwed up. Via the ls command I see a filename like this: 'F'$'\366''ljesedel.pdf', which includes the ' at beginning and end. That's just one example where the Swedish characters åäö went wrong; in this example the name should have been 'Följesedel.pdf'. If I run
#>find .
Then I see a list of files like this:
./F?ljesedel.pdf
Not the same encoding. How on earth do I solve this one? The most obvious ways:
myvar='$'\366''
char="ö"
find . -name *$myvar* -exec rename 's/$myvar/ö' {} \;
and other possible ways fail since
find . -name cannot find it due to the ? instead of the "real" characters " '$'\366'' "
Any suggestions or guidance would be very much appreciated.
The first question is what encoding your terminal expects. Make sure that is UTF-8.
Then you need to find what bytes the actual filename contains, not just what something might display it as. You can do this with a perl oneliner like follows, run in the directory containing the file:
perl -E'opendir my $dh, "."; printf "%s: %vX\n", $_, $_ for grep { m/jesedel\.pdf/ } readdir $dh'
This will output the filename interpreted as UTF-8 bytes (if you've set your terminal to that) followed by the hex bytes it actually contains.
Using that you can determine what your search pattern should be. Your replacement must be the UTF-8 encoded representation of ö, which it will be by default as part of the command arguments if your terminal is set to that.
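As a minimal sketch of the final rename, assuming the perl output shows the bad byte is 0xF6 (the Latin-1 encoding of ö) and you are on a Linux filesystem, which allows arbitrary bytes in names; bash's $'\366' syntax produces that single raw byte:

```shell
# create a file whose name contains the raw byte 0xF6, as in the question
# (hypothetical reproduction; requires bash for the $'\366' syntax)
mkdir -p /tmp/encdemo && cd /tmp/encdemo
touch "F"$'\366'"ljesedel.pdf"

# replace the bad byte with a proper UTF-8 "ö" in every affected name
for f in *; do
    new="${f//$'\366'/ö}"
    [ "$f" = "$new" ] || mv -- "$f" "$new"
done
```

If the hex dump shows a different byte (or byte sequence), substitute it in both the pattern and nothing else changes.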
I'm not an expert, but the problem might not be with the file name itself (which seems to hold the correct Unicode name) but with the way ls (and many other utilities) display the name in the terminal.
I was able to show the correct name by setting the terminal character encoding to Unicode. Also, I've noticed that GUI programs (file manager, etc.) were able to show the correct file name.
Gnome Terminal: Terminal → Set Character Encoding → Unicode (UTF-8)
It is still a challenge with many utilities to 'select' those files (e.g., by regexp or wildcard). In some cases, you will have to select those characters using a '*' pattern. If this is a major issue, consider using ASCII only; maybe use 'o' instead of 'ö'. Not sure if this is acceptable.

bulk file renaming in bash, to remove name with spaces, leaving trailing digits

Can a bash/shell expert help me with this? Each time I use Adobe PDF to split a large pdf file (say its name is X.pdf) into separate pages, where each page is one pdf file, it creates files with this pattern
"X 1.pdf"
"X 2.pdf"
"X 3.pdf" etc...
The file name "X" above is the original file name, which can be anything. It then adds one space after the name, then the page number. Page numbers always start from 1 and go up to however many pages there are. There is no option in Adobe PDF to change this.
I need to run a shell command to simply remove/strip out all the "X " part, and just leave the digits, like this
1.pdf
2.pdf
3.pdf
....
100.pdf ...etc..
Not being good in pattern matching, not sure what regular expression I need.
I know I need something like
for i in *.pdf; do mv "$i" ........; done
And it is the ....... part I do not know how to do.
This only needs to run on Linux/Unix system.
Use sed:
for i in *.pdf; do mv "$i" "$(sed 's/.*[[:blank:]]//' <<< "$i")"; done
And it would be simple through rename
rename 's/.*\s//' *.pdf
You can remove everything up to (including) the last space in the variable with this:
${i##* }
That's "star space" after the double hash, meaning "anything followed by space". ${i#* } would remove up to the first space.
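The difference between the two expansions is easy to see on a sample name:

```shell
i="My Report 12.pdf"
echo "${i##* }"   # 12.pdf         (## = longest match: strips through the last space)
echo "${i#* }"    # Report 12.pdf  (#  = shortest match: strips through the first space)
```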
So run this to check:
for i in *.pdf; do echo mv -i -- "$i" "${i##* }" ; done
and remove the echo if it looks good. The -i suggested by Gordon Davisson will prompt you before overwriting, and -- signifies end of options, which prevents things from blowing up if you ever have filenames starting with -.
If you just want to do bulk renaming of files (or directories) and don't mind using external tools, then here's mine: rnm
The command to do what you want would be:
rnm -rs '/.*\s//' *.pdf
.*\s selects the part up to (and including) the last whitespace and replaces it with the empty string.
Note:
It doesn't overwrite any existing files (throws warning if it finds an existing file with the target name).
And this operation is failsafe. You can get back the changes made by last rnm command with rnm -u.
Here's a list of documents for rnm.

Run multiple tools as single bash script

I am running different programs in isolation. Say there is one command line invocation for a C++ tool, and another one for an R tool. First I run the command line for the C++ app, which produces a resulting file. Only then can I run the command line for the R app, which requires the resulting file from the C++ app.
I may have many different data sets to process. Is there any way to write a bash script that loops over the different tools (C++, R, any other), so I don't have to sit and manually write many command lines?
I would like to go to sleep while a time-consuming loop makes noise on my computer.
Running multiple, different programs in some defined order is the fundamental idea of a (systems) scripting language like bash:
## run those three programs in sequence
first argument
second parameter
third
# same as
first argument; second parameter; third
You can do a lot of fancy things, like redirecting input and output streams:
grep secret secrets.file | grep -v strong | sort > result.file
# pipe | feeds everything from the standard output
# of the program on the left into
# the standard input of the one on the right
This includes also things like conditionals and of course, loops:
while IFS= read -r -d '' file; do
preprocess "$file"
some_work | generate "$file.output"
done < <(find ./data -type f -name 'source*' -print0)
As you might see, bash is a programming language on its own, with a bit of a weird syntax IMHO.
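Put together, the asker's two-stage workflow might be sketched like this (tr and wc -w are stand-ins for the C++ and R tools so the loop is runnable as-is; substitute your real command lines):

```shell
set -e
mkdir -p /tmp/pipedemo
printf 'hello\n' > /tmp/pipedemo/a.txt
printf 'wide world\n' > /tmp/pipedemo/b.txt

for data in /tmp/pipedemo/*.txt; do
    out="${data%.txt}.result"
    tr a-z A-Z < "$data" > "$out"   # stage 1: the "C++ tool" writes a result file
    wc -w < "$out"                  # stage 2: the "R tool" consumes that file
done
```

With set -e the loop stops at the first failing stage, so a broken result file is never fed to the second tool.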

How can I redirect output of a Windows C++ program to another executable?

I am experimenting with a brute force attack using C++ against a password protected rar file. I am already able to generate all of the possible 'passwords'. I am just unsure how to automate attempts to extract the archive using each of the generated combinations from my program. I'm on Windows and am trying to do this with WinRar.
One could somewhat easily do something like:
#include <iostream>

int main (int argc, char** argv)
{
    for (;;) {
        /* do something */
        // '\0' (a char) actually emits the NUL byte; "\0" would print
        // nothing, since operator<< on a C string stops at the terminator
        std::cout << clever_password << '\0';
    }
}
… and then in the shell, simply:
your-clever-password-guesser | \
sed -e 's,'\'','\''"'\''"'\'',g' | \
xargs -0 -n1 -r -t -I {} -- unrar e 'p{}' some-file.rar
Breaking that down:
Print out each password guess with a terminating '\0' character. This allows the password to (potentially) contain things like spaces and tabs that might otherwise “mess up” in the shell.
Ask the stream editor sed to protect you from apostrophes '. Each ' must be encoded as a sequence of '\'' (apos-backslash-apos-apos) or '"'"' (apos-quote-apos-quote-apos) to pass through the shell safely. The s///g pattern replaces every ' with '"'"', but the apostrophes that it, itself is passing to sed are written as '\''. (I mixed the styles of escaping the ' to make it easier for me to distinguish between the apostrophe-escaping for sed and the apostrophe-escaping which sed is adding to the stream of passwords.) One could, instead, alter the strings as they're being printed in the C++ program.
Invoke xargs to run unrar with each password, with the options that mean:
Each password is delimited by \0 (-0)
Use only one at a time (-n1)
Don't run if there isn't anything to do (-r) — e.g. if your program didn't print out any possible passwords at all.
Show the command-line as it's going to be run (-t) — this lets you monitor the guesses as they fly past on your screen
Put the password in place of the somewhat traditional for that purpose symbol {} (-I {})
Then, run the command that follows --
Extract from the RAR file (unrar e …)
With the password given replacing the {} in 'p{}'; the ' here protect against spaces and things that may be in the password
Then, the filename to un-RAR
If you wanted to try to run multiple unrar instances in parallel, you could also insert -P4 into the xargs invocation (e.g. …-I {} -P4 --…) to run 4 instances at a time; adjust this until your machine gets too loaded down to gain any benefits. (Since this is likely disc I/O bound, you might want to make sure to copy the RAR file into a RAM filesystem like /tmp or /run before starting it, if it's a reasonable size, so that you're not waiting on disc I/O as much, but the OS will likely cache the file after a few dozen rounds, so that might not actually help much over the course of a long run.)
This is a brute-force way to do it, but doesn't require as deep a knowledge of programming as, say, using fork/exec/wait to launch unrar processes, or using a rar-enabled library to do it yourself (which would probably yield a significant improvement in speed over launching the executable hundreds or thousands of times)
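You can try the apostrophe-escaping step on its own to see the substitution at work:

```shell
printf "it's\n" | sed -e 's,'\'','\''"'\''"'\'',g'
# → it'"'"'s
```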
PS
I realized afterwards that perhaps you're looking for interaction with the actual WinRAR™ program. The above isn't at all helpful for that; but it will enable you to run the command-line unrar repeatedly.
Also, if you're on a Windows system, you'd need to install some of the standard shell utilities — a POSIX-compatible Bourne shell like BASH, sed, and xargs — which might imply something like Cygwin being needed. I don't have any practical experience with Windows systems to give good advice about how to do that, though.
WinRAR has an API, though it only supports decompression. It is as simple as one function call from their API to attempt to decompress the file. Follow the link:
http://www.rarlab.com/rar_add.htm
Good luck!

Controlling shell command line wildcard expansion in C or C++

I'm writing a program, foo, in C++. It's typically invoked on the command line like this:
foo *.txt
My main() receives the arguments in the normal way. On many systems, argv[1] is literally *.txt, and I have to call system routines to do the wildcard expansion. On Unix systems, however, the shell expands the wildcard before invoking my program, and all of the matching filenames will be in argv.
Suppose I wanted to add a switch to foo that causes it to recurse into subdirectories.
foo -a *.txt
would process all text files in the current directory and all of its subdirectories.
I don't see how this is done, since, by the time my program gets a chance to see the -a, the shell has already done the expansion and the user's *.txt input is lost. Yet there are common Unix programs that work this way. How do they do it?
In Unix land, how can I control the wildcard expansion?
(Recursing through subdirectories is just one example. Ideally, I'm trying to understand the general solution to controlling the wildcard expansion.)
Your program has no influence over the shell's command line expansion. Which program will be called is determined only after all the expansion is done, so it's already too late to change anything about the expansion programmatically.
The user calling your program, on the other hand, can create whatever command line they like. Shells make it easy to prevent wildcard expansion, usually by putting the argument in single quotes:
program -a '*.txt'
If your program is called like that it will receive two parameters -a and *.txt.
On Unix, you should just leave it to the user to manually prevent wildcard expansion if it is not desired.
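For instance, with two scratch files, printf shows exactly what a program would receive in each case:

```shell
mkdir -p /tmp/globdemo && cd /tmp/globdemo
touch a.txt b.txt

printf '%s\n' *.txt     # the shell expands it: a.txt b.txt
printf '%s\n' '*.txt'   # quoted: the program sees the literal string *.txt
```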
As the other answers said, the shell does the wildcard expansion - and you stop it from doing so by enclosing arguments in quotes.
Note that options -R and -r are usually used to indicate recursive - see cp, ls, etc for examples.
Assuming you organize things appropriately so that wildcards are passed to your program as wildcards and you want to do recursion, then POSIX provides routines to help:
nftw - file tree walk (recursive access).
fnmatch, glob, wordexp - to do filename matching and expansion
There is also ftw, which is very similar to nftw but it is marked 'obsolescent' so new code should not use it.
Adrian asked:
But I can say ls -R *.txt without single quotes and get a recursive listing. How does that work?
To adapt the question to a convenient location on my computer, let's review:
$ ls -F | grep '^m'
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2
mte/
$ ls -R1 m*
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2
mte:
multithread.ec
multithread.ec.original
multithread2.ec
$
So, I have a sub-directory 'mte' that contains three files. And I have six files with names that start 'm'.
When I type 'ls -R1 m*', the shell notes the metacharacter '*' and uses its equivalent of glob() or wordexp() to expand that into the list of names:
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2
mte
Then the shell arranges to run '/bin/ls' with 9 arguments (program name, option -R1, plus 7 file names and terminating null pointer).
The ls command notes the options (recursive and single-column output), and gets to work.
The first 6 names (as it happens) are simple files, so there is nothing recursive to do.
The last name is a directory, so ls prints its name and its contents, invoking its equivalent of nftw() to do the job.
At this point, it is done.
This uncontrived example doesn't show what happens when there are multiple directories, and so the description above over-simplifies the processing.
Specifically, ls processes the non-directory names first, and then processes the directory names in alphabetic order (by default), and does a depth-first scan of each directory.
foo -a '*.txt'
Part of the shell's job (on Unix) is to expand command line wildcard arguments. You prevent this with quotes.
Also, on Unix systems, the "find" command does what you want:
find . -name '*.txt'
will list all files recursively from the current directory down.
Thus, you could do
foo `find . -name '*.txt'`
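One caveat with the backtick form: command substitution splits its output on whitespace, so filenames containing spaces will break. find's -exec hands each name over intact (printf stands in here for the hypothetical foo):

```shell
# command substitution would split a name like "b c.txt" into two
# arguments; find -exec passes each matched name as one argument
mkdir -p /tmp/findemo/sub
touch /tmp/findemo/a.txt "/tmp/findemo/sub/b c.txt"

# printf brackets each argument it receives, making the boundaries visible
find /tmp/findemo -name '*.txt' -exec printf '<%s>\n' {} +
```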
I wanted to point out another way to turn off wildcard expansion: you can tell your shell to stop expanding wildcards with the noglob option.
With bash use set -o noglob:
> touch a b c
> echo *
a b c
> set -o noglob
> echo *
*
And with csh, use set noglob:
> echo *
a b c
> set noglob
> echo *
*