unix: how to tell if a string matches a regex


Trying out fish shell, so I'm translating my bash functions. The problem is that in one case, I'm using bash regexes to check if a string matches a regex. I can't figure out how to translate this into fish.
Here is my example.
if [[ "$arg" =~ ^[0-9]+$ ]]
...
I looked into sed, but I don't see a way to get it to set its exit status based on whether the regex matches.
I looked into delegating to Ruby, but again, getting the exit status set based on the match requires making this really ugly (see below).
I looked into delegating back to bash, but despite trying maybe three or four ways, never got that to match.
So, is there a way in *nix to check if a string matches a regex, so I can drop it into a conditional?
Here is what I have that currently works, but which I am unhappy with:
# kill jobs by job number, or range of job numbers
# example: k 1 2 5
# example: k 1..5
# example: k 1..5 7 10..15
# example: k 1-5 7 10-15
function k
    for arg in $argv
        if ruby -e "exit ('$arg' =~ /^[0-9]+\$/ ? 0 : 1)"
            kill -9 %$arg
        else
            set _start (echo "$arg" | sed 's/[^0-9].*$//')
            set _end (echo "$arg" | sed 's/^[0-9]*[^0-9]*//')
            for n in (seq $_start $_end)
                kill -9 %"$n"
            end
        end
    end
end

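The standard way is to use grep; its -q option suppresses the output, so only the exit status reports whether the regex matched:
if echo "$arg" | grep -q -E '^[0-9]+$'
    kill -9 %$arg
end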
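For a plain POSIX sh script (rather than fish), the same grep test can be wrapped in a small function so it drops straight into a conditional. This is only a sketch: the name matches_regex is made up for illustration, and it assumes the string is a single line, since grep tests each input line separately.

```shell
# matches_regex STRING ERE
# Hypothetical helper: succeeds (status 0) if STRING matches the
# extended regular expression ERE, fails (status 1) otherwise.
matches_regex() {
    # printf avoids the surprises echo has with strings such as "-n";
    # "--" stops grep from treating a pattern starting with "-" as an option.
    printf '%s\n' "$1" | grep -Eq -- "$2"
}
```

Usage would look like: if matches_regex "$arg" '^[0-9]+$'; then echo "number"; fi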

I would suggest using fish's built-in string match subcommand:
if string match -r -q '^[0-9]+$' $arg
    echo "number!"
else
    echo "not a number"
end

Starting from version 3.0, Bash natively supports regexes through the =~ operator (since version 3.2, a quoted pattern is matched as a literal string); see http://www.tldp.org/LDP/abs/html/bashver3.html#REGEXMATCHREF

Related

Check if any replacement done by `perl -i -pe`

In GNU sed, I can display the result of successful substitution of the search pattern. Simple example as the following:
echo -e "nginx.service\nmariadb.service\nphp-fpm.service" > something.conf;
sed -ri 's|(mariadb)(\.service)|postgresql-9.4\2|w sed-output.log' something.conf;
[[ -s sed-output.log ]] && echo "Pattern found and modified. $(cat sed-output.log)" || echo "Pattern not found.";
Because sed is limited when dealing with multiple lines, I switched to perl.
echo -e "nginx.service\nmariadb.service\nphp-fpm.service" > something.conf;
perl -i -pe 's|(mariadb)(\.service)|postgresql-9.4\2|' something.conf;
The code above does the same as sed, but how can I get the modified content ("postgresql-9.4.service") into a file, or printed out?
Basically, what I would like to achieve is that after the script has been executed, it tells me whether it was successful (and what was actually substituted); if not, I'll display a message saying what couldn't be found and replaced.
Edit:
Highlighted that I want to get only the modified content, which indicates that my script was successful, because with perl -i -pe 's/pattern/replace/' file I can't tell whether it returned true or false. Of course I can simply run grep -E "pattern" to find out, but that's not the question.

This code exits with status 0 when a replacement is made:
$ perl -i -pe '$M += s|(mariadb)(\.service)|postgresql-9.4\2|;END{exit 1 unless $M>0}' something.conf
$ echo $?
0
When NO substitution is done, return code will be 1:
$ perl -i -pe '$M += s|(maria)(\.service)|postgresql-9.4\2|;END{exit 1 unless $M>0}' something.conf
$ echo $?
1
From Perl documentation
An END code block is executed as late as possible, that is, after perl
has finished running the program and just before the interpreter is
being exited, even if it is exiting as a result of a die() function.
(But not if it's morphing into another program via exec, or being
blown out of the water by a signal--you have to trap that yourself (if
you can).) You may have multiple END blocks within a file--they will
execute in reverse order of definition; that is: last in, first out
(LIFO). END blocks are not executed when you run perl with the -c
switch, or if compilation fails.
Number of replacements returned from s operator
s/PATTERN/REPLACEMENT/msixpodualngcer
Searches a string for a pattern, and if found, replaces that pattern
with the replacement text and returns the number of substitutions
made.

It isn't as tidy in Perl because you have to open your log file explicitly, and for a one-liner that has to be in a BEGIN block. But Perl's s/// returns the number of changes made, so you can test it for truth.
Note also that $2 is better than \2 in the replacement part in Perl. \2 is only tolerated there for the benefit of people used to sed; in an ordinary double-quoted string it represents the character with code point 2, or Unicode U+0002 START OF TEXT.
perl -i -pe ' BEGIN { open F, ">perl-output.log" } print F $_ if s|(mariadb)(\.service)|postgresql-9.4$2| ' something.conf

You can check the output directly if you only print the substituted lines:
if [[ -z $(sed -n 's/mariadb\(\.service\)/postgresql-9.4\1/p' something.conf) ]]; then
echo nope
fi
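Building on the sed -n ... p approach above, here is a minimal sketch that reports both outcomes and shows what the substitution produces. The function name report_substitution is hypothetical, and the sketch only previews the changed lines; it does not edit the file in place.

```shell
# report_substitution FILE
# Hypothetical helper: prints the lines the substitution would produce,
# or a "not found" message with a non-zero status.
report_substitution() {
    # -n suppresses the default output; the p flag prints only lines
    # on which the substitution actually happened.
    changed=$(sed -n 's/mariadb\(\.service\)/postgresql-9.4\1/p' "$1")
    if [ -n "$changed" ]; then
        printf 'Pattern found and modified. %s\n' "$changed"
    else
        printf 'Pattern not found.\n'
        return 1
    fi
}
```

Called as report_substitution something.conf, it can be followed by the real in-place edit once the pattern is known to match.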

can i use regex in a bash shell script

I am a first-year computer technician student, and before now I had never actually used Linux. I grasped the basics of scripting fairly quickly and am trying to write a script that will create a directory and a soft link for each assignment. Seeing as we average two assignments a week, I thought this would be helpful.
I am having trouble making my script accept only numbers as arguments. I have it mostly working with three case statements, but would rather use a basic regex with an if statement, if I can.
if [ $# != 1 ]; then
red='\033[0;31m'
NC='\033[0m'
blue='\033[1;34m'
NC='\033[0m'
echo 1>&2 "${blue}ERROR:${NC} $0 :${red}expecting only one variable, you gave $#($*)${NC}"
echo 1>&2 "${blue}Usage:${NC} $0 :${red}number of assignment.${NC}"
exit 2
fi
case $1 in
[[:punct:]]*) echo 1>&2 "Numeric Values Only"
exit 2
;;
[[:alpha:]]*) echo 1>&2 "Numeric Values Only"
exit 2
;;
[[:space:]]*) echo 1>&2 "Numeric Values Only"
exit 2
;;
esac
The script then makes the directory, creates a soft link for the marking script (if it's posted), and ends. Can anyone help me shorten or eliminate the case statements?

You cannot use regular expressions in portable shell scripts (ones that will run on any POSIX compliant shell). In general, patterns in the shell are globs, not regular expressions.
That said, there are a few other options. If you are using Bash specifically, you have two choices; you can use extended globs, which give you some regex like functionality:
shopt -s extglob
case $1 in
+([[:digit:]]) ) echo "digits" ;;
*) echo "not digits" ;;
esac
Another option is that Bash has the =~ operator in the [[ ]] conditional construct to match strings against a regex; this doesn't work in a case statement, but works in an if:
if [[ $1 =~ ^[0-9]+$ ]]
then
echo "digits"
fi
Finally, if you want to do regular expression matching portably (so it will run in other POSIX shells like ash, dash, ksh, zsh, etc), you can call out to grep; if you pass -q, it will be silent about the matches but return success or failure depending on whether the input matched:
if echo "$1" | grep -qE '^[0-9]+$'
then
echo "digits"
fi
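Since shell patterns are globs, an all-digits check can also be written portably with a single case statement and a negated bracket expression, with no regex and no external command. A small sketch, with is_all_digits as a made-up name:

```shell
# is_all_digits VALUE
# Hypothetical helper: succeeds only if VALUE is non-empty and
# consists solely of the characters 0-9.
is_all_digits() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;  # empty, or contains at least one non-digit
        *) return 0 ;;
    esac
}
```

In the script above this could replace all three case branches: if ! is_all_digits "$1"; then echo 1>&2 "Numeric Values Only"; exit 2; fi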

How to get a part of a string with a regular expression in a /bin/sh script

I need to extract part of a string in a shell script. The original string is pretty complicated, so I really need a regular expression to select the right part of it; just removing a prefix and suffix won't work. Also, the regular expression needs to check the context of the string I want to extract, so I need, for example, a regular expression like a\([^b]*\)b to extract 123 from 12a123b23.
The shell script needs to be portable, so I cannot make use of the Bash constructs [[ and BASH_REMATCH.
I want the script to be robust, so when the regular expression does not match, the script should notice this e.g. through a non-zero exit code of the command to be used.
What is a good way to do this?
I've tried various tools, but none of them fully solved the problem:
expr match "$original" ".*$regex.*" works except for the error case. With this command, I don't know how to detect if the regex did not match. Also, expr seems to take the extracted string to determine its exit code - so when I happened to extract 00, expr had an exit code of 1. So I would need to generally ignore the exit code with expr match "$original" ".*$regex.*" || true
echo "$original" | sed "s/.*$regex.*/\\1/" also works except for the error case. To handle this case, I'd need to test if I got back the original string, which is also quite inelegant.
So, isn't there a better way to do this?

You could use the -n option of sed to suppress output of all input lines and add the p option to the substitute command, like this:
echo "$original" | sed -n -e "s/.*$regex.*/\1/p"
If the regular expression matches, the matched group is printed as before. But now if the regular expression does not match, nothing is printed and you will need to test only for the empty string.
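The empty-string test can be folded into a function that returns a non-zero status when nothing matched. This is a sketch only: extract is a hypothetical name, the regex must be a basic regular expression containing one \(...\) group, and a match whose group is empty is indistinguishable from no match.

```shell
# extract STRING BRE
# Hypothetical helper: prints the text captured by the first \(...\)
# group of BRE, or returns 1 if the pattern did not match.
extract() {
    # Inside double quotes the shell leaves \1 untouched, so sed sees it.
    result=$(printf '%s\n' "$1" | sed -n -e "s/.*$2.*/\1/p")
    [ -n "$result" ] || return 1
    printf '%s\n' "$result"
}
```

For the example in the question, extract 12a123b23 'a\([^b]*\)b' prints 123 and returns 0.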

How about grep -o? The only possible problem is portability; otherwise it satisfies all requirements:
➜ echo "hello and other things" | grep -o hello
hello
➜ echo $?
0
➜ echo "hello and other things" | grep -o nothello
➜ echo $?
1
One of the best things is that, since it's grep, you can pick which regex flavor you want: BRE, ERE or Perl.

If egrep is available (it almost always is):
egrep 'YourPattern' YourFile
or
egrep "${YourPattern}" YourFile
If only grep is available:
grep -e 'YourPattern' YourFile
Check the status of the command with a classic [ $? -eq 0 ] (and take into account that a status greater than 1 means an error, such as an unreadable YourFile).
For the content itself, extract it with sed or awk (for portability) after the failure test. Note that sed uses basic regex syntax by default, so the pattern may need adjusting:
Content="$( sed -n -e "/${YourPattern}/{s/.*\(${YourPattern}\).*/\1/p;q;}" YourFile )"

regexp (sed) suppress "no match" output

I'm stuck on that and can't wrap my head around it: How can I tell sed to return the value found, and otherwise shut up?
It's really beyond me: why would sed return the whole string if it found nothing? Do I have to run another test on the returned string to verify it? I tried using "-n" from the (very short) man page, but it effectively suppresses all output, including matched strings.
This is what I have now :
echo plop-02-plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/'
which returns
02 (and that is fine and dandy, thank you very much), but:
echo plop-02plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/'
returns
plop-02plop (when it should return this = "" nothing! Dang, you found nothing so be quiet!
For crying out loud !!)
I tried checking for a return value, but this failed too ! Gasp !!
$ echo plop-02-plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/' ; echo $?
02
0
$ echo plop-02plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/' ; echo $?
plop-02plop
0
$
This last one I cannot even believe. Is sed really the tool I should be using? I want to extract a needle from a haystack, and I want a needle or nothing..?

sed by default prints all lines.
What you want to do is
/patt/!d;s//repl/
In other words, delete the lines that do not match your pattern, and for those that do match, extract the element you want, for instance by giving the number of a capturing group. The empty pattern in s// reuses the last regex. In your case it will be:
sed -e '/^.*\(.\)\([0-9][0-9]\)\1.*$/!d;s//\2/'
You can also use the -n option to suppress the echoing of all lines. A line is then printed only when you explicitly ask for it, here with the p flag (which is what was missing when you tried -n). In practice, scripts using -n are usually longer and more cumbersome to maintain. Here it will be:
sed -ne 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/p'
There is also grep, but your example shows why sed is sometimes better.
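To make the "needle or nothing" behaviour easy to check, the !d form can be wrapped in a function and tried on both inputs from the question. A sketch, with needle as a made-up name:

```shell
# needle STRING
# Hypothetical wrapper around the /patt/!d;s//repl/ idiom: prints the two
# digits that sit between two identical characters, or nothing at all.
needle() {
    printf '%s\n' "$1" |
        sed -e '/^.*\(.\)\([0-9][0-9]\)\1.*$/!d;s//\2/'
}
```

With this, needle plop-02-plop prints 02, while needle plop-02plop prints nothing. sed still exits with status 0 in both cases, so the caller should test the output for emptiness rather than the exit status.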

Perhaps you can use egrep -o?
input.txt:
blooody
aaaa
bbbb
odor
qqqq
E.g.
sehe@meerkat:/tmp$ egrep -o o+ input.txt
ooo
o
o
sehe@meerkat:/tmp$ egrep -no o+ input.txt
1:ooo
4:o
4:o
Of course egrep will have slightly different (better?) regex syntax for advanced constructs (back-references, non-greedy operators). I'll let you do the translation, if you like the approach.

Bash regex for strong password

How can I use the following regex in a BASH script?
(?=^.{8,255}$)((?=.*\d)(?!.*\s)(?=.*[A-Z])(?=.*[a-z]))^.*
I need to check the user input(password) for the following:
at least one Capital Letter.
at least one number.
at least one small letter.
and the password should be between 8 and 255 characters long.

If your version of grep has the -P option, it supports PCRE (Perl-Compatible Regular Expressions).
grep -P '(?=^.{8,255}$)(?=^[^\s]*$)(?=.*\d)(?=.*[A-Z])(?=.*[a-z])'
I had to change your expression to reject spaces since it always failed. The extra set of parentheses didn't seem necessary. I left off the ^.* at the end since that always matches and you're really only needing the boolean result like this:
while ! echo "$password" | grep -qP ...
do
    read -r -s -p "Please enter a password: " password
done

I don't think that your regular expression is the best (or correct?) way to check the things on your list (hint: I'd check the length independently of the other conditions), but to answer the question about using it in Bash: use the return value of grep -Eq, e.g.:
if echo "$candidate_password" | grep -Eq "$strong_pw_regex"; then
echo strong
else
echo weak
fi
Alternatively in Bash 3 and later you can use the =~ operator:
if [[ "$candidate_password" =~ $strong_pw_regex ]]; then
…
fi
The regexp syntax of grep -E or Bash does not necessarily support all the things you are using in your example, but it is possible to check your requirements with either. But if you want fancier regular expressions, you'll probably need to substitute something like Ruby or Perl for Bash.
As for modifying your regular expression, check the length with Bash (${#candidate_password} gives you the length of the string in the variable candidate_password) and then use a simple syntax with no lookahead. You could even check all three conditions with separate regular expressions for simplicity.
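The suggestion above (length checked by the shell, then one simple pattern per requirement) could be sketched as follows. The function name is_strong_password is hypothetical, and the POSIX character classes and the whitespace rule are assumptions chosen to mirror the lookaheads in the question.

```shell
# is_strong_password CANDIDATE
# Hypothetical helper: succeeds if CANDIDATE is 8 to 255 characters long,
# contains an upper-case letter, a lower-case letter and a digit,
# and contains no whitespace.
is_strong_password() {
    pw=$1
    # ${#pw} is the length of the string, so no regex is needed for it.
    [ "${#pw}" -ge 8 ] && [ "${#pw}" -le 255 ] || return 1
    # One simple pattern per requirement; no lookahead required.
    for class in '[[:upper:]]' '[[:lower:]]' '[[:digit:]]'; do
        printf '%s\n' "$pw" | grep -q "$class" || return 1
    done
    # Reject whitespace, like (?!.*\s) in the original expression.
    ! printf '%s\n' "$pw" | grep -q '[[:space:]]'
}
```

It is used like the other checks: if is_strong_password "$candidate_password"; then echo strong; else echo weak; fi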

These matches are connected with the logical AND operator, which means the only good match is when all of them match.
Therefore the simplest way is to match those conditions chained, with the previous result piped into the next expression. Then if any of the matches fail, the entire expression fails:
$ echo "tEsTstr1ng" | egrep '^.{8,255}$' | egrep "[ABCDEFGHIJKLMNOPQRSTUVWXYZ]" | egrep "[abcdefghijklmnopqrstuvwxyz]" | egrep "[0-9]"
I manually entered all characters instead of "[A-Z]" and "[a-z]" because different system locales might substitute them as [aAbBcC..., which is two conditions in one match and we need to check for both conditions.
As shell script:
#!/bin/sh
a="tEsTstr1ng"
# each egrep passes the string on only if its own condition holds
b=$(echo "$a" | egrep '^.{8,255}$' |
    egrep "[ABCDEFGHIJKLMNOPQRSTUVWXYZ]" |
    egrep "[abcdefghijklmnopqrstuvwxyz]" |
    egrep "[0-9]")
# if the result string is empty, one of the conditions has failed
if [ -z "$b" ]
then
    echo "Conditions do not match"
else
    echo "Conditions match"
fi

grep with the -E option uses Extended Regular Expressions (ERE). According to the documentation, ERE does not support lookahead.
So you can use Perl for this instead. Note that, as written, this one-liner exits with status 1 when the input line matches and 0 otherwise:
perl -ne 'exit 1 if(/(?=^.{8,255}$)((?=.*\d)(?=.*[A-Z])(?=.*[a-z])|(?=.*\d)(?=.*[^A-Za-z0-9])(?=.*[a-z])|(?=.*[^A-Za-z0-9])(?=.*[A-Z])(?=.*[a-z])|(?=.*\d)(?=.*[A-Z])(?=.*[^A-Za-z0-9]))^.*/);exit 0;'

I get that you are looking for a regex, but have you considered doing it through a PAM module?
dictionary
quality
There might be other interesting modules.
