I am confused by the usage of brackets, parentheses, curly braces in Bash, as well as the difference between their double or single forms. Is there a clear explanation?
In Bash, test and [ are shell builtins.
The double bracket, which is a shell keyword, enables additional functionality. For example, you can use && and || instead of -a and -o and there's a regular expression matching operator =~.
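As a minimal sketch of those extras (assuming Bash 3.2 or later; the variable names and sample values are invented for illustration):

```bash
#!/usr/bin/env bash
# Hypothetical sample values, chosen only for illustration.
file="report_2024.txt"
count=7

# && and || (and grouping with parentheses) work directly inside [[ ]];
# no -a / -o needed.
if [[ -n $file && ( $count -gt 5 || $count -eq 0 ) ]]; then
    combined="yes"
else
    combined="no"
fi

# =~ matches against an extended regular expression; capture groups
# are stored in the BASH_REMATCH array.
year=""
if [[ $file =~ _([0-9]{4})\.txt$ ]]; then
    year=${BASH_REMATCH[1]}
fi

echo "combined=$combined year=$year"
```

The same condition written with [ would need -a and -o (or several separate [ commands) and an external tool such as grep or expr for the regular expression.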
Also, in a simple test, double square brackets appear to evaluate noticeably faster than single ones.
$ time for ((i=0; i<10000000; i++)); do [[ "$i" = 1000 ]]; done
real 0m24.548s
user 0m24.337s
sys 0m0.036s
$ time for ((i=0; i<10000000; i++)); do [ "$i" = 1000 ]; done
real 0m33.478s
user 0m33.478s
sys 0m0.000s
In addition to delimiting a variable name, braces are used for parameter expansion, so you can do things like:
Truncate the contents of a variable
$ var="abcde"; echo ${var%d*}
abc
Make substitutions similar to sed
$ var="abcde"; echo ${var/de/12}
abc12
Use a default value
$ default="hello"; unset var; echo ${var:-$default}
hello
and several more
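A few of those other expansions, sketched on a made-up path (the path is purely illustrative and does not need to exist):

```bash
#!/usr/bin/env bash
path="/var/log/app/error.log.gz"   # hypothetical path for illustration

base=${path##*/}      # strip longest prefix matching */   -> error.log.gz
dir=${path%/*}        # strip shortest suffix matching /*  -> /var/log/app
stem=${base%%.*}      # strip longest suffix matching .*   -> error
ext=${base#*.}        # strip shortest prefix matching *.  -> log.gz
len=${#base}          # length of the value in characters  -> 12

echo "$dir | $base | $stem | $ext | $len"
```

A single # or % removes the shortest match and the doubled form removes the longest, which is why ${base%%.*} and ${base#*.} split the name at the first dot.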
Also, brace expansions create lists of strings which are typically iterated over in loops:
$ echo f{oo,ee,a}d
food feed fad
$ mv error.log{,.OLD}
(error.log is renamed to error.log.OLD because the brace expression
expands to "mv error.log error.log.OLD")
$ for num in {000..2}; do echo "$num"; done
000
001
002
$ echo {00..8..2}
00 02 04 06 08
$ echo {D..T..4}
D H L P T
Note that the leading zero and increment features weren't available before Bash 4.
Thanks to gboffi for reminding me about brace expansions.
Double parentheses are used for arithmetic operations:
((a++))
((meaning = 42))
for ((i=0; i<10; i++))
echo $((a + b + (14 * c)))
and they enable you to omit the dollar signs on integer and array variables and include spaces around operators for readability.
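A small sketch of that, with arbitrary sample values, showing both an assignment and (( )) used as a condition:

```bash
#!/usr/bin/env bash
a=5
b=3
c=2
nums=(10 20 30)

# No dollar signs needed on a, b, c or on the array element.
(( total = a + b + (14 * c) + nums[1] ))   # 5 + 3 + 28 + 20 = 56

# As a condition, (( )) succeeds when the expression is non-zero.
if (( total > 50 && a < b * 2 )); then
    verdict="big"
else
    verdict="small"
fi

echo "$total $verdict"
```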
Single brackets are also used for array indices:
array[4]="hello"
element=${array[index]}
Curly braces are required for (most/all?) array references on the right-hand side.
ephemient's comment reminded me that parentheses are also used for subshells. And that they are used to create arrays.
array=(1 2 3)
echo ${array[1]}
2
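To illustrate why the braces matter for array references, here is a short sketch (sample values are arbitrary) comparing the forms with and without them:

```bash
#!/usr/bin/env bash
array=(1 2 3)
array[4]="hello"            # indices need not be contiguous

count=${#array[@]}          # number of elements that are set -> 4
indices="${!array[*]}"      # indices in use                   -> 0 1 2 4
first=$array                # no index at all means element 0  -> 1
wrong="$array[1]"           # no braces: element 0, then a literal "[1]"
right="${array[1]}"         # with braces: element 1           -> 2

echo "$count | $indices | $first | $wrong | $right"
```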
A single bracket ([) is a builtin in Bash, but an external program named [ usually exists as well; see man test or man [ for more info. Example:
$ VARIABLE=abcdef
$ if [ "$VARIABLE" = abcdef ] ; then echo yes ; else echo no ; fi
yes
The double bracket ([[) does basically the same thing as a single bracket, but it is a Bash keyword rather than a command.
$ VARIABLE=abcdef
$ if [[ $VARIABLE == 123456 ]] ; then echo yes ; else echo no ; fi
no
Parentheses, ( ), are used to create a subshell. For example:
$ pwd
/home/user
$ (cd /tmp; pwd)
/tmp
$ pwd
/home/user
As you can see, the subshell allowed you to perform operations without affecting the environment of the current shell.
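The same isolation applies to variables, not only to the working directory. A minimal sketch, with an arbitrary variable name:

```bash
#!/usr/bin/env bash
counter=1
start_dir=$PWD

( counter=99; cd / )        # both changes are confined to the subshell

after_subshell=$counter     # still 1 in the parent shell
same_dir=no
[ "$PWD" = "$start_dir" ] && same_dir=yes

echo "$after_subshell $same_dir"
```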
(a) Braces ({}) are used to unambiguously identify variables. Example:
$ VARIABLE=abcdef
$ echo Variable: $VARIABLE
Variable: abcdef
$ echo Variable: $VARIABLE123456
Variable:
$ echo Variable: ${VARIABLE}123456
Variable: abcdef123456
(b) Braces are also used to execute a sequence of commands in the current shell context, e.g.
$ { date; top -b -n1 | head ; } >logfile
# the output of 'date' and 'top' is concatenated into logfile;
# this can sometimes be useful for hunting down a top loader
$ { date; make 2>&1; date; } | tee logfile
# now we can calculate the duration of a build from the logfile
There is a subtle syntactic difference with ( ), though (see bash reference) ; essentially, a semicolon ; after the last command within braces is a must, and the braces {, } must be surrounded by spaces.
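The practical difference between the two grouping forms can be sketched as follows (variable names are arbitrary):

```bash
#!/usr/bin/env bash
x=0
{ x=1; }          # runs in the current shell: x becomes 1
in_braces=$x

( x=2 )           # runs in a subshell: the parent's x is untouched
in_parens=$x

# Either kind of group can share one redirection or one pipe.
lines=$( { echo first; echo second; } | wc -l )
lines=$(( lines ))   # normalise the padding some wc implementations add

echo "$in_braces $in_parens $lines"
```

Note the space after $( in the last assignment; it keeps the shell from reading the construct as the start of an arithmetic expansion.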
Brackets
if [ CONDITION ]                          Test construct
if [[ CONDITION ]]                        Extended test construct
Array[1]=element1                         Array initialization
[a-z]                                     Range of characters within a regular expression
$[ expression ]                           A non-standard and obsolete version of $(( expression )) [1]
[1] http://wiki.bash-hackers.org/scripting/obsolete

Curly Braces
${variable}                               Parameter substitution
${!variable}                              Indirect variable reference
{ command1; command2; . . . commandN; }   Block of code
{string1,string2,string3,...}             Brace expansion
{a..z}                                    Extended brace expansion
{}                                        Text replacement, after find and xargs

Parentheses
( command1; command2 )                    Command group executed within a subshell
Array=(element1 element2 element3)        Array initialization
result=$(COMMAND)                         Command substitution, new style
>(COMMAND)                                Process substitution
<(COMMAND)                                Process substitution

Double Parentheses
(( var = 78 ))                            Integer arithmetic
var=$(( 20 + 5 ))                         Integer arithmetic, with variable assignment
(( var++ ))                               C-style variable increment
(( var-- ))                               C-style variable decrement
(( var0 = var1<98?9:21 ))                 C-style ternary operation
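Process substitution is the least self-explanatory entry in the list above, so here is a hedged sketch of two common uses (it assumes a diff command is installed; the sample data is invented):

```bash
#!/usr/bin/env bash
# Feed a loop from a command without using a pipe, so that variables
# set inside the loop survive (a pipe would run the loop in a subshell).
total=0
while read -r n; do
    (( total += n ))
done < <(printf '%s\n' 1 2 3)

# Compare the output of two commands as if they were files.
if diff <(printf 'a\nb\n') <(printf 'a\nb\n') >/dev/null; then
    same=yes
else
    same=no
fi

echo "$total $same"
```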
I just wanted to add these from TLDP:
~:$ echo $SHELL
/bin/bash
~:$ echo ${#SHELL}
9
~:$ ARRAY=(one two three)
~:$ echo ${#ARRAY}
3
~:$ echo ${TEST:-test}
test
~:$ echo $TEST
~:$ export TEST=a_string
~:$ echo ${TEST:-test}
a_string
~:$ echo ${TEST2:-$TEST}
a_string
~:$ echo $TEST2
~:$ echo ${TEST2:=$TEST}
a_string
~:$ echo $TEST2
a_string
~:$ export STRING="thisisaverylongname"
~:$ echo ${STRING:4}
isaverylongname
~:$ echo ${STRING:6:5}
avery
~:$ echo ${ARRAY[*]}
one two one three one four
~:$ echo ${ARRAY[*]#one}
two three four
~:$ echo ${ARRAY[*]#t}
one wo one hree one four
~:$ echo ${ARRAY[*]#t*}
one wo one hree one four
~:$ echo ${ARRAY[*]##t*}
one one one four
~:$ echo $STRING
thisisaverylongname
~:$ echo ${STRING%name}
thisisaverylong
~:$ echo ${STRING/name/string}
thisisaverylongstring
The difference between test, [ and [[ is explained in great detail in the BashFAQ.
(Note: The link shows many examples for comparison)
To cut a long story short: test implements the old, portable syntax of
the command. In almost all shells (the oldest Bourne shells are the
exception), [ is a synonym for test (but requires a final argument of
]). Although all modern shells have built-in implementations of [,
there usually still is an external executable of that name, e.g.
/bin/[.
[[ is a new, improved version of it, and it is a keyword, not a program.
This has beneficial effects on the ease of use, as shown below. [[ is
understood by KornShell and BASH (e.g. 2.03), but not by the older
POSIX or BourneShell.
And the conclusion:
When should the new test command [[ be used, and when the old one [?
If portability/conformance to POSIX or the BourneShell is a concern, the old syntax should
be used. If on the other hand the script requires BASH, Zsh, or KornShell,
the new syntax is usually more flexible.
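One concrete way the new syntax is more forgiving, sketched with a made-up value that contains a space:

```bash
#!/usr/bin/env bash
name="two words"     # hypothetical value containing a space

# [[ ]] does not word-split unquoted variables.
if [[ $name = "two words" ]]; then new_style=match; else new_style=nomatch; fi

# [ is an ordinary command, so the variable must be quoted; unquoted,
# $name would split into two arguments and [ would report an error.
if [ "$name" = "two words" ]; then old_style=match; else old_style=nomatch; fi

# Pattern matching on the right-hand side exists only in [[ ]].
if [[ $name = two* ]]; then glob=match; else glob=nomatch; fi

echo "$new_style $old_style $glob"
```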
Parentheses in function definition
Parentheses () are used in function definitions:
function_name () { command1 ; command2 ; }
That is the reason you have to escape parentheses even in command parameters:
$ echo (
bash: syntax error near unexpected token `newline'
$ echo \(
(
$ echo () { command echo The command echo was redefined. ; }
$ echo anything
The command echo was redefined.
Some common and handy uses for brackets, parentheses, and braces
As mentioned above, sometimes you want a message displayed without losing the return value. This is a handy snippet:
$ [ -f go.mod ] || { echo 'File not found' && false; }
This produces no output and a return value of 0 (true) if the file go.mod exists in the current directory. Test the result:
$ echo $?
0
If the file does not exist, you get the message but also a return value of 1 (false), which can also be tested:
$ [ -f fake_file ] || { echo 'File not found'; false; }
File not found
$ echo $?
1
You can also simply create a function to check if a file exists:
fileexists() { [ -f "$1" ]; }
or if a file is readable (not corrupted, has the right permissions, etc.):
canread() { [ -r "$1" ]; }
or if it is a directory:
isdir() { [ -d "$1" ]; }
or is writable for the current user:
canwrite() { [ -w "$1" ]; }
or if a file exists and is not empty (like a log file with content...):
notempty() { [ -s "$1" ]; }
There are more details at: TLDP
You can also see if a program exists and is available on the path:
exists () { command -v "$1" > /dev/null 2>&1; }
This is useful in scripts, for example:
# gitit does an autosave commit to the current
# if Git is installed and available.
# If git is not available, it will use brew
# (on macOS) to install it.
#
# The first argument passed, if any, is used as
# the commit message; otherwise the default is used.
gitit() {
    if exists git; then
        # Stop at the first git step that fails.
        git add --all &&
        git commit -m "${1:-GitBot: dev progress autosave}" &&
        git push
    else
        brew install git
    fi
}
Additional info about How to use parentheses to group and expand expressions:
(it is listed on the link syntax-brackets)
Some main points in there:
Group commands in a sub-shell: ( )
(list)
Group commands in the current shell: { }
{ list; }
Test - return the binary result of an expression: [[ ]]
[[ expression ]]
Arithmetic expansion
The format for Arithmetic expansion is:
$(( expression ))
The format for a simple Arithmetic Evaluation is:
(( expression ))
Combine multiple expressions
( expression )
(( expr1 && expr2 ))
I'm modifying a script used to test a report writer. The report writer takes optional --from and --to flags to specify the start and end dates. I'd like to modify the script function that starts up the report writer so that its date arguments are also optional.
Sadly, there are already optional arguments to the function, so I'm trying to test whether an argument is in the right format for a date (we use nn/nn/nnnn).
So, I'm echoing the candidate string and checking with grep whether it is in the correct format. Except it doesn't work.
Here is an extract from the function
# If the next argument looks like a date, consume it and use it to define
# the report start date
looksLikeDate=$(echo $1 | grep -e '[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]')
echo from -
echo \$1: \"$1\"
echo looksLikeDate: \"$looksLikeDate\"
if [ -n $looksLikeDate ]
then
echo "-n: true"
FROMFLAG="--from $1"
shift 1
else
echo "-n : false"
FROMFLAG=""
fi
# If the next argument looks like a date, consume it and use it to define
# the report end date
looksLikeDate=$(echo $1 | grep -e '[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]')
echo to -
echo \$1: \"$1\"
echo looksLikeDate: \"$looksLikeDate\"
if [ -n $looksLikeDate ]
then
echo "-n: true"
TOFLAG="--to $1"
shift 1
else
echo "-n: false"
TOFLAG=""
fi
...and here is the output with dates...
from -
$1: "09/02/2018"
looksLikeDate: "09/02/2018"
-n: true
to -
$1: "09/02/2018"
looksLikeDate: "09/02/2018"
-n: true
...and without...
from -
$1: ""
looksLikeDate: ""
-n: true
to -
$1: ""
looksLikeDate: ""
-n: true
...what have I missed? I'd expect that, since looksLikeDate is demonstrably empty, [ -n $looksLikeDate ] would return false and the code would go down the else path of the if statement.
Update:
Since posting, it occurs to me that the easiest thing is not to look at the arguments in the function at all, and instead get callers to pass --from and --to with the arguments, so that I can simply pass $* to the report writer as is done for the existing optional arguments.
Thank you very much for reading; I'm still curious as to why the posted code doesn't work.
That's because you're not quoting your variables! Use more quotes!
So here's what's happening: when Bash sees
[ -n $looksLikeDate ]
it performs parameter expansion, glob expansion, quote removal, etc., and finally sees this (I put one token on each line):
[
-n
]
and you see that the $looksLikeDate part is missing: the parameter expands to the empty string and, because the expansion is unquoted, the resulting empty word is dropped entirely. Then Bash executes the builtin [, and with the closing ], this is equivalent to the following command:
test -n
Now looking at the reference manual for the test builtin, you'll read:
1 argument
The expression is true if, and only if, the argument is not null.
And here, the argument is -n, hence not null, hence the expression is true.
So remember:
Use more quotes! Quote all your variable expansions!
This specific line should look like:
[ -n "$looksLikeDate" ]
Another possibility is to use the [[ keyword:
[[ -n $looksLikeDate ]]
But anyway, quote all your expansions!
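The three forms can be compared side by side with an empty variable; this is only a small demonstration of the behaviour described above:

```bash
#!/usr/bin/env bash
looksLikeDate=""

[ -n $looksLikeDate ]   && unquoted=true || unquoted=false   # becomes: [ -n ]
[ -n "$looksLikeDate" ] && quoted=true   || quoted=false     # becomes: [ -n "" ]
[[ -n $looksLikeDate ]] && keyword=true  || keyword=false    # no word splitting

echo "unquoted=$unquoted quoted=$quoted keyword=$keyword"
```

Only the unquoted single-bracket form reports true for an empty value.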
Also, you don't need the external tool grep: you can use Bash's internal regex engine or, better yet, a plain pattern match:
if [[ $1 = [[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]][[:digit:]][[:digit:]] ]]; then
which is a bit long, so use a variable:
date_pattern="[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]][[:digit:]][[:digit:]]"
if [[ $1 = $date_pattern ]]; then
(and here you mustn't quote the right hand side $date_pattern).
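Put together, the check could be wrapped in a small helper. This is a sketch only: looks_like_date and parse_from are hypothetical names, and FROMFLAG mirrors the variable used in the question.

```bash
#!/usr/bin/env bash
date_pattern='[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]][[:digit:]][[:digit:]]'

# Hypothetical helper: succeeds when the whole argument is nn/nn/nnnn.
looks_like_date() {
    # The right-hand side is left unquoted so it is treated as a pattern.
    [[ $1 = $date_pattern ]]
}

# Hypothetical helper: set FROMFLAG only if the argument is a date.
parse_from() {
    FROMFLAG=""
    if looks_like_date "$1"; then
        FROMFLAG="--from $1"
    fi
}

parse_from "09/02/2018"
echo "FROMFLAG='$FROMFLAG'"
```

Unlike the grep version, the pattern match has to cover the entire string, so an argument that merely contains a date is rejected.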
I have a backup tool that takes database backup daily and stores them with the following format:
*_DATE_*.*.sql.gz
with DATE being in YYYY-MM-DD format.
How could I delete old files (by comparing YYYY-MM-DD in the filenames) matching the pattern above, while leaving only the newest one.
Example:
wordpress_2020-01-27_06h25m.Monday.sql.gz
wordpress_2020-01-28_06h25m.Tuesday.sql.gz
wordpress_2020-01-29_06h25m.Wednesday.sql.gz
At the end only the last file, meaning wordpress_2020-01-29_06h25m.Wednesday.sql.gz, should remain.
Assuming:
The preceding substring left to _DATE_ portion does not contain underscores.
The filenames do not contain newline characters.
Then would you try the following:
for f in *.sql.gz; do
echo "$f"
done | sort -t "_" -k 2 | head -n -1 | xargs rm --
If your sort, head and cut commands support the -z option, the following code will be more robust against special characters in the filenames:
for f in *.sql.gz; do
[[ $f =~ _([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2})_ ]] && \
printf "%s\t%s\0" "${BASH_REMATCH[1]}" "$f"
done | sort -z | head -z -n -1 | cut -z -f 2- | xargs -0 rm --
It uses the NUL character as the line delimiter, which allows any special characters in the filenames.
It first extracts the DATE portion from each filename and prepends it to the filename as a first field, separated by a tab character.
It then sorts the files by the DATE string, excludes the last (newest) one, cuts the first field off to recover the filenames, and removes those files.
I found this in another question. Although it serves the purpose, it does not select the files based on their filenames.
ls -tp | grep -v '/$' | tail -n +2 | xargs -I {} rm -- {}
Since the pattern (glob) you present us is very generic, we have to make an assumption here.
assumption: the date pattern, is the first sequence that matches the regex [0-9]{4}-[0-9]{2}-[0-9]{2}
Files are of the form: constant_string_<DATE>_*.sql.gz
a=( *.sql.gz )
unset "a[${#a[@]}-1]"
rm -- "${a[@]}"
Files are of the form: *_<DATE>_*.sql.gz
Using this, it is easily done in the following way:
a=( *.sql.gz )
cnt=0; ref="0000-00-00"; for f in "${a[@]}"; do
    [[ "$f" =~ [0-9]{4}(-[0-9]{2}){2} ]] \
    && [[ "$BASH_REMATCH" > "$ref" ]] \
    && ref="${BASH_REMATCH}" && refi=$cnt
    ((++cnt))
done
# refi is the index of the newest file; drop it from the list to delete
unset "a[refi]"
rm -- "${a[@]}"
[[ expression ]] <snip> An additional binary operator, =~, is available, with the same precedence as == and !=. When it is used, the string to the right of the operator is considered an extended regular expression and matched accordingly (as in regex(3)). The return value is 0 if the string matches the pattern, and 1 otherwise. If the regular expression is syntactically incorrect, the conditional expression's return value is 2. If the shell option nocasematch is enabled, the match is performed without regard to the case of alphabetic characters. Any part of the pattern may be quoted to force it to be matched as a string. Substrings matched by parenthesized subexpressions within the regular expression are saved in the array variable BASH_REMATCH. The element of BASH_REMATCH with index 0 is the portion of the string matching the entire regular expression. The element of BASH_REMATCH with index n is the portion of the string matching the nth parenthesized subexpression
source: man bash
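As a small illustration of the passage quoted above, using one of the file names from the question (the variable names are arbitrary):

```bash
#!/usr/bin/env bash
f="wordpress_2020-01-29_06h25m.Wednesday.sql.gz"

# Keeping the regex in a variable avoids quoting surprises.
re='_([0-9]{4})-([0-9]{2})-([0-9]{2})_'

if [[ $f =~ $re ]]; then
    whole=${BASH_REMATCH[0]}    # entire match    -> _2020-01-29_
    year=${BASH_REMATCH[1]}     # first group     -> 2020
    month=${BASH_REMATCH[2]}    # second group    -> 01
    day=${BASH_REMATCH[3]}      # third group     -> 29
fi

echo "$whole $year $month $day"
```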
Go to the folder where you have the *_DATE_*.*.sql.gz files and try the command below
ls -ltr *.sql.gz|awk '{print $9}'|awk '/2020/{print $0}' |xargs rm
or use
ls -ltr |grep '2019-05-20'|awk '{print $9}'|xargs rm
Replace /2020/ with the pattern you want to delete; for example, to match 2020-05-01, use /2020-05-01/.
Using two for loops
#!/bin/bash
shopt -s nullglob ##: This might not be needed but just in case
##: If there are no files the glob will not expand
latest=
allfiles=()
unwantedfiles=()
for file in *_????-??-??_*.sql.gz; do
if [[ $file =~ _([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2})_ ]]; then
allfiles+=("$file")
[[ $file > $latest ]] && latest=$file ##: The > is magical inside [[
fi
done
n=${#allfiles[@]}
if ((n <= 1)); then ##: No files or only one file don't remove it!!
printf '%s\n' "Found ${n:-0} ${allfiles[*]:-*sql.gz} file, bye!"
exit 0 ##: Exit gracefully instead
fi
for f in "${allfiles[@]}"; do
[[ $latest == "$f" ]] && continue ##: Skip the latest file in the loop.
unwantedfiles+=("$f") ##: Save all files in an array without the latest.
done
printf 'Deleting the following files: %s\n' "${unwantedfiles[*]}"
echo rm -rf "${unwantedfiles[@]}"
Relies heavily on the > test operator inside [[
You can create a new file with lower dates and should still be good.
The echo is there just to see what's going to happen. Remove it if you're satisfied with the output.
I'm actually using this script via cron now, except for the *.sql.gz part, since I only have directories to match, but with the same date format. So I have ????-??-??/ as the glob and only ([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}) as the regex pattern.
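Since > inside [[ compares whole strings, comparing file names only works when everything before the date is identical. A hedged variant that compares just the extracted dates might look like this; newest_by_date is a hypothetical helper name and the file names in the usage line are invented:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the argument whose embedded _YYYY-MM-DD_
# date is the greatest; arguments without such a date are ignored.
newest_by_date() {
    local f best="" best_date=""
    for f in "$@"; do
        [[ $f =~ _([0-9]{4}-[0-9]{2}-[0-9]{2})_ ]] || continue
        # ISO dates sort correctly as plain strings.
        if [[ ${BASH_REMATCH[1]} > $best_date ]]; then
            best_date=${BASH_REMATCH[1]}
            best=$f
        fi
    done
    printf '%s\n' "$best"
}

newest_by_date "zeta_2020-01-27_a.sql.gz" "alpha_2020-01-29_b.sql.gz"
```

Here the second name is reported as the newest even though it sorts first alphabetically.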
You can use my Python script "rotate-archives" for smart delete backups. (https://gitlab.com/k11a/rotate-archives).
An example of starting archives deletion:
rotate-archives.py test_mode=off age_from-period-amount_for_last_timeslot=7-5,31-14,365-180-5 archives_dir=/mnt/archives
As a result, archives 7 to 30 days old are kept at intervals of 5 days, archives 31 to 364 days old at intervals of 14 days, and archives 365 days old or more at intervals of 180 days, up to a maximum of 5.
However, this requires either moving the _date_ part to the beginning of the file name or letting the script add the current date to new files.
I have a bash variable that looks like
aaa-bb-cccc-r17, m12w_pp_r2, z-r123, etc.
I am looking to extract everything up to the (final) -rNNN (any number of digits), or in other words, remove the final -rNNN. If the variable does not end in -r followed by a number, I want to leave it unchanged.
I tried ${the_variable%-r[0-9]*} but it turns out the * is the shell * ("match anything") rather than the regular expression * ("match any number of occurrences of previous element"). Using + instead of * ("one or more") matched nothing.
Any solution (along this line or any other)?
You can do this with extended pattern support.
$ shopt -s extglob
$ the_variable=aaa-bb-cccc-r17
$ echo "${the_variable%-r+([0-9])}"
aaa-bb-cccc
As you found, parameter expansion alone can't directly do what you want here.
You could play games with stripping everything up to the last - in the value and then checking that the remaining string matches your desired pattern but at that point you might as well just do the pattern match directly and be done.
$ pat='(.*)-r[0-9]+$'
$ var='aaa-bb-cccc-r17, m12w_pp_r2, z-r123'
$ [[ $var =~ $pat ]] && var=${BASH_REMATCH[1]}
$ declare -p var
declare -- var="aaa-bb-cccc-r17, m12w_pp_r2, z"
$ var='aaa-bb-cccc-r17, m12w_pp_r2, z-r123g'
$ [[ $var =~ $pat ]] && var=${BASH_REMATCH[1]}
$ declare -p var
declare -- var="aaa-bb-cccc-r17, m12w_pp_r2, z-r123g"
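The regex approach can be wrapped in a function that leaves non-matching values untouched. This is a sketch: strip_rev is a hypothetical name, and the pattern uses + so that at least one digit is required after -r.

```bash
#!/usr/bin/env bash
# Hypothetical helper: remove a trailing -rNNN, otherwise echo the
# argument unchanged.
strip_rev() {
    local re='^(.*)-r[0-9]+$'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        printf '%s\n' "$1"
    fi
}

for v in aaa-bb-cccc-r17 m12w_pp_r2 z-r123 z-r123g; do
    printf '%s -> %s\n' "$v" "$(strip_rev "$v")"
done
```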
I have a folder with files named as file_1.ext...file_90.ext. I can list a range of them with the following command:
$ ls /home/rasoul/myfolder/file_{6..19}.ext
but when I want to use this command inside a bash script, it doesn't work. Here is a minimal example:
#!/bin/bash
DIR=$1
st=$2
ed=$3
FILES=`ls ${DIR}/file\_\{$st..$ed\}.ext`
for f in $FILES; do
echo $f
done
running it as,
$ bash test_script.sh /home/rasoul/myfolder 6 19
outputs the following error message:
ls: cannot access /home/rasoul/myfolder/file_{6..19}.ext: No such file or directory
Brace expansion happens before variable expansion.
(Moreover, don't parse ls output.) You could instead say:
for f in $(seq $st $ed); do
echo "${DIR}/file_${f}.ext";
done
BASH always does brace expansion before variable expansion which is why ls is looking for a file /home/rasoul/myfolder/file_{6..19}.ext.
I personally use seq when I need to expand a number range that has variables in it. You could also use eval with echo to accomplish the same thing:
eval echo {$st..$ed}
But even if you used seq in your script, ls would not iterate over your range without a loop. If you want to check if files in the range exist, I would also avoid using ls here as you will get errors for every file in the range that doesn't exist. BASH can check if a file exists using -e.
Here is a loop that would check if a file exists within the range between variables $st and $ed and print it if it does:
for n in $(seq $st $ed); do
f="${DIR}/file_$n.ext"
if [ -e "$f" ]; then
echo "$f"
fi
done
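If seq is not available, or you would rather avoid an external command, a C-style for loop accepts variables for both bounds. A minimal sketch, reusing the variable names from the question with sample values:

```bash
#!/usr/bin/env bash
DIR=/home/rasoul/myfolder   # directory from the question
st=6                        # sample start of the range
ed=9                        # sample end of the range

files=()
for (( n = st; n <= ed; n++ )); do
    files+=( "${DIR}/file_${n}.ext" )
done

printf '%s\n' "${files[@]}"
```

Collecting the names in an array keeps them intact even if the directory name contains spaces.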
The range pattern {A..B} does not accept variables for A or B. You need constants for them.
A workaround might be to start a subshell like this:
RESULT=$(bash -c "ls {$a..$b}")
Numeric ranges have to be literal numbers, you can't put variables in there. To do it you need to use eval:
FILES=`eval "ls ${DIR}/file_{$st..$ed}.ext"`
Here's a transcript of my test (I tried it in bash 4.1.5 and 3.2.48).
imac:testdir $ touch file_{1..30}.ext
imac:testdir $ st=6
imac:testdir $ ed=20
imac:testdir $ DIR=.
imac:testdir $ FILES=`eval "ls ${DIR}/file_{$st..$ed}.ext"`
imac:testdir $ echo "$FILES"
./file_10.ext
./file_11.ext
./file_12.ext
./file_13.ext
./file_14.ext
./file_15.ext
./file_16.ext
./file_17.ext
./file_18.ext
./file_19.ext
./file_20.ext
./file_6.ext
./file_7.ext
./file_8.ext
./file_9.ext
imac:testdir $
So, I'm setting up a bash script and want to parse arguments to certain flags using getopts. For a minimal example, consider a script which has a flag, -M, that takes y or n as an argument. If I use the following code:
#!/bin/bash
# minimalExample.sh
while getopts "M:" OPTION;
do
case ${OPTION} in
M)
RMPI=${OPTARG}
if ! [[ "$RMPI" =~ "^[yn]$" ]]
then
echo "-M must be followed by either y or n."
exit 1
fi
;;
esac
done
I get the following:
$ ./minimalExample.sh -M y
-M must be followed by either y or n.
FAIL: 1
However, if I use the following code instead
#!/bin/bash
# minimalExample2.sh
while getopts "M:" OPTION;
do
case ${OPTION} in
M)
RMPI=${OPTARG}
if [ -z $(echo $RMPI | grep -E "^[yn]$") ]
then
echo "-M must be followed by either y or n."
exit 1
else
echo "good"
fi
;;
esac
done
I get:
$ ./minimalExample2.sh -M y
good
Why doesn't minimalExample.sh work?
Quoting the regexp in this context forces a string comparison.
Change it to
if ! [[ "$RMPI" =~ ^[yn]$ ]]
Check the following post for more details:
bash regex with quotes?
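The effect of the quotes can be seen directly. This sketch assumes Bash 3.2 or later with the default (non-compat31) behaviour; the result variable names are arbitrary:

```bash
#!/usr/bin/env bash
RMPI=y
re='^[yn]$'

# Unquoted: the value of re is used as a regular expression.
[[ $RMPI =~ $re ]]      && unquoted=match || unquoted=nomatch

# Quoted: the value of re is matched as a literal string.
[[ $RMPI =~ "$re" ]]    && quoted=match   || quoted=nomatch

# The literal text ^[yn]$ does contain itself, so this one matches.
[[ '^[yn]$' =~ "$re" ]] && literal=match  || literal=nomatch

echo "$unquoted $quoted $literal"
```

Storing the pattern in a variable and expanding it unquoted is a common way to sidestep the problem.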
Why do you need regex here at all? -M y is not the same as -M n, is it? So you definitely will use some statement (case or if) to distinguish one from another.
#!/bin/bash
while getopts "M:" OPTION; do
case ${OPTION} in
M)
case ${OPTARG} in
y)
# do what must be done if -M y
;;
n)
# do what must be done if -M n
;;
*)
echo >&2 "-M must be followed by either y or n."
exit 1
;;
esac
;;
esac
done
Please note >&2 – error messages should be output to STDERR, not to STDOUT.