Parse args with a regular expression (regex)

Can you help me?
I want to parse these: {'$', '$0', '$qwerty', '$123'} # a $ followed by any character repeated zero or more times
In a shell script:
echo "$" | grep '^\$.*$'
$
This works.
echo "$1" | grep '^\$.*$'
echo "$hello" | grep '^\$.*$'
echo "$Qwerty123" | grep '^\$.*$'
These do not work: grep prints nothing.
Thanks for any reply.

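Use single quotes, not double quotes. Inside double quotes the shell expands $1, $hello and $Qwerty123 (empty here) before grep ever sees them. With single quotes the text is passed through literally: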
echo '$1' | grep '^\$.*$'
$1
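To make the quoting difference concrete, here is a minimal sketch, assuming bash. The helper name is_dollar_arg and the result variables are hypothetical and only serve the illustration.

```shell
#!/usr/bin/env bash
# is_dollar_arg is a hypothetical helper: it succeeds when its argument
# starts with a literal "$".
is_dollar_arg() {
    printf '%s\n' "$1" | grep -q '^\$.*$'
}

# Single quotes keep the dollar sign literal, so all four samples match.
matched=0
for arg in '$' '$0' '$qwerty' '$123'; do
    if is_dollar_arg "$arg"; then
        matched=$((matched + 1))
    fi
done
echo "matched: $matched"

# In double quotes the shell expands $hello (unset here) to an empty string
# before grep runs, so there is nothing left to match.
unset hello
if is_dollar_arg "$hello"; then
    expanded_matches=yes
else
    expanded_matches=no
fi
echo "double-quoted \$hello matches: $expanded_matches"
```

The lone "$" from the question worked in double quotes only because a dollar sign with no name after it is not expanded.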

Related

How can I extract the timestamp from the end of a shell variable when the format isn't fixed?

I'm trying to extract the timestamp from the end of a shell variable like this:
Input=AEXP_CSTONE_EU_prpbdp_sourcefile_yyyymmddhhmmss.txt
TimeStamp=`echo $Input | awk -F"_" '{print $6}'`
This works for this particular case, but the format of the string can change. For example, it could also be:
Input=AEXP_CSTONE_EU_prpbdp_sourcefile_prospects_yyyymmddhhmmss.txt
The variable will always end with yyyymmddhhmmss.txt. How can I extract the timestamp consistently?
Given:
$ echo $Input
AEXP_CSTONE_EU_prpbdp_sourcefile_prospects_20151116141111.txt
You can use sed:
$ echo $Input | sed -n 's|.*_\([0-9]\{14\}\)\.txt|\1|p'
20151116141111
Or nested grep:
$ echo $Input | grep -Eo '_[0-9]{14}\.txt' | grep -Eo '[0-9]{14}'
20151116141111
awk:
$ echo $Input | awk -F_ '{split($NF, a, "."); print a[1]}'
20151116141111
Perl
$ echo $Input | perl -ne 'print $1 if /_(\d{14})\.txt/'
20151116141111
cut and rev:
$ echo $Input | rev | cut -d'_' -f 1 | rev | cut -d'.' -f 1
20151116141111
Bash:
$ last=${Input##*_}
$ echo $last
20151116141111.txt
$ ts=${last%.*}
$ echo $ts
20151116141111
In summary, lots of ways...
If you don't want to lose the .txt part, it is even easier:
$ echo $Input | sed -n 's|.*_\([0-9]\{14\}\.txt\)|\1|p'
20151116141111.txt
$ echo $Input | grep -Eo '[0-9]{14}\.txt$'
20151116141111.txt
$ echo $Input | awk -F_ '{print $NF}'
20151116141111.txt
$ echo $Input | perl -ne 'print $1 if /_(\d{14}\.txt)/'
20151116141111.txt
$ echo $Input | rev | cut -d'_' -f 1 | rev
20151116141111.txt
$ last=${Input##*_}
$ echo $last
20151116141111.txt
You need to match the part that does not change:
TimeStamp=$(echo $Input | perl -pe 's/.*(\d{14})\.txt/$1/')
You are extracting the 6th field separated by _, yet it seems you really want to extract the last field. You can do that with parameter expansion:
timestamp=${Input##*_}
timestamp=${timestamp%.txt}
See BashFAQ 100 for more on string manipulation in bash.
In awk, you'd use $NF to reference the last field, though awk is overkill for this.
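The parameter-expansion approach can be wrapped in a small function that also checks the result. This is a sketch assuming bash; extract_timestamp is a hypothetical name, and the 14-digit check is taken from the yyyymmddhhmmss format stated in the question.

```shell
#!/usr/bin/env bash
# extract_timestamp is a hypothetical helper, not from the original answers.
# It prints the 14-digit timestamp at the end of a name such as
# foo_bar_20151116141111.txt and returns non-zero if the name does not
# end that way.
extract_timestamp() {
    local name=$1
    local last=${name##*_}      # drop everything up to the last underscore
    local ts=${last%.txt}       # drop the trailing .txt
    if [[ $ts =~ ^[0-9]{14}$ ]]; then
        printf '%s\n' "$ts"
    else
        return 1
    fi
}

Input=AEXP_CSTONE_EU_prpbdp_sourcefile_prospects_20151116141111.txt
TimeStamp=$(extract_timestamp "$Input")
echo "$TimeStamp"
```

Because the function only looks at the text after the last underscore, the number of fields before the timestamp does not matter.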

How to display part of matched pattern in grep?

I want to extract 12 from text like "abc_12_1". This is what I tried:
echo "abc_12_1" | grep -Eo '[a-zA-Z]+_[0-9]+_1'
abc_12_1
But I cannot select just the digits after the first _; the command above prints the whole string. I am looking for a grep equivalent of the following Perl pattern match:
perl -e '"abc_55_1" =~ m/[a-zA-Z]+_([0-9]+)_1/ ; print $1'
55
Is it possible with grep?
Using perl:
$ echo "abc_12_1" | perl -lne 'print /_(\d+)_/'
12
or grep:
$ echo "abc_12_1" | grep -oP '(?<=_)\d+(?=_)'
12
You could use cut:
cut -d_ -f2 <<< "abc_12_1"
Using grep:
grep -oP '(?<=_).*?(?=_)' <<< "abc_12_1"
Both would yield 12.
One way is to use awk
echo "abc_12_1" | awk -F_ '{print $2}'
12
Or grep
echo "abc_12_1" | grep -o "[0-9][0-9]"
12
Using grep with extended regex
grep -oE "[0-9]{2}" # Get only hits with two digits
grep -oE "[0-9]{2,}" # Get hits with two or more digits
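If a real capture group is wanted without Perl, bash's own regex operator is another option. The following is a sketch assuming bash; middle_number is a hypothetical helper name, and the pattern mirrors the one from the question.

```shell
#!/usr/bin/env bash
# middle_number is a hypothetical helper: for strings shaped like abc_12_1
# it prints the digits between the two underscores.
middle_number() {
    local re='^[a-zA-Z]+_([0-9]+)_1$'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"   # group 1 holds the captured digits
    else
        return 1
    fi
}

number=$(middle_number "abc_12_1")
echo "$number"
```

Keeping the pattern in a variable avoids quoting surprises on the right-hand side of =~.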

How to use sed to identify a string in brackets?

I want to find the string that is enclosed in brackets. How do I use sed to pull it out?
# cat /sys/block/sdb/queue/scheduler
noop anticipatory deadline [cfq]
I'm not getting the expected result:
# cat /sys/block/sdb/queue/scheduler | sed 's/\[*\]//'
noop anticipatory deadline [cfq
I'm expecting this output:
cfq
grep can be easier, particularly if the position of the bracketed text varies:
$ grep -Po '(?<=\[)[^]]*' file
cfq
This uses a look-behind: after each [, it fetches all the characters up to the next ].
See another example:
$ cat a
noop anticipatory deadline [cfq]
hello this [is something] we want to [enclose] yeah
$ grep -Po '(?<=\[)[^]]*' a
cfq
is something
enclose
You can also use awk for this, if the text is always in the same position:
$ awk -F'[][]' '{print $2}' file
cfq
This sets the field separators to [ and ], then prints the second field.
And with sed:
$ sed 's/[^[]*\[\([^]]*\).*/\1/g' file
cfq
It is a bit messy, but basically it looks for the block of text between [ and ] and prints it back.
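To see why the attempt in the question removed only the closing bracket, here is a small sketch, assuming bash and a POSIX sed. The variable names are hypothetical, and the sample line is copied from the question instead of being read from /sys.

```shell
#!/usr/bin/env bash
# Sample line copied from the question; on a real system it would come from
# /sys/block/sdb/queue/scheduler.
line='noop anticipatory deadline [cfq]'

# \[*\] means "zero or more [ characters, then ]", so only the closing
# bracket is matched and removed.
broken=$(printf '%s\n' "$line" | sed 's/\[*\]//')

# Match the whole line, capture what sits between the brackets, and print
# only when the substitution succeeded.
active=$(printf '%s\n' "$line" | sed -n 's/.*\[\([^]]*\)\].*/\1/p')

echo "$broken"
echo "$active"
```

With -n and the p flag, lines that contain no brackets produce no output at all.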
I found one possible solution:
cut -d "[" -f2 | cut -d "]" -f1
So the complete command is:
# cat /sys/block/sdb/queue/scheduler | cut -d "[" -f2 | cut -d "]" -f1
Another potential solution is awk:
s='noop anticipatory deadline [cfq]'
awk -F'[][]' '{print $2}' <<< "$s"
cfq
Another way, with GNU grep:
grep -Po "\[\K[^]]*" file
with pure shell:
while read line; do [[ "$line" =~ \[([^]]*)\] ]] && echo "${BASH_REMATCH[1]}"; done < file
Another awk
echo 'noop anticipatory deadline [cfq]' | awk '{gsub(/.*\[|\].*/,x)}8'
cfq
perl -lne 'print $1 if(/\[([^\]]*)\]/)'

Can not extract the capture group with either sed or grep

I want to extract the value from a key-value pair, but I cannot.
Example I tried:
echo employee_id=1234 | sed 's/employee_id=\([0-9]+\)/\1/g'
But this gives employee_id=1234 and not 1234 which is actually the capture group.
What am I doing wrong here? I also tried:
echo employee_id=1234| egrep -o employee_id=([0-9]+)
but with no success.
1. Use grep -Eo: (as egrep is deprecated)
echo 'employee_id=1234' | grep -Eo '[0-9]+'
1234
2. using grep -oP (PCRE):
echo 'employee_id=1234' | grep -oP 'employee_id=\K([0-9]+)'
1234
3. Using sed:
echo 'employee_id=1234' | sed 's/^.*employee_id=\([0-9][0-9]*\).*$/\1/'
1234
To expand on anubhava's answer number 2, the general pattern to have grep return only the capture group is:
$ regex="$precedes_regex\K($capture_regex)(?=$follows_regex)"
$ echo $some_string | grep -oP "$regex"
so
# matches and returns b
$ echo "abc" | grep -oP "a\K(b)(?=c)"
b
# no match
$ echo "abc" | grep -oP "z\K(b)(?=c)"
# no match
$ echo "abc" | grep -oP "a\K(b)(?=d)"
Using awk
echo 'employee_id=1234' | awk -F= '{print $2}'
1234
Use sed -E for extended regex:
echo employee_id=1234 | sed -E 's/employee_id=([0-9]+)/\1/g'
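The reason the original command printed the input unchanged is that sed uses basic regular expressions by default, where an unescaped + is an ordinary character. The sketch below, assuming bash and a sed that supports -E (GNU and BSD both do), compares the three spellings; the variable names are hypothetical.

```shell
#!/usr/bin/env bash
pair='employee_id=1234'

# Basic regex (the default): "+" is an ordinary character, so nothing
# matches and the input is printed unchanged.
bre_plus=$(printf '%s\n' "$pair" | sed 's/employee_id=\([0-9]+\)/\1/')

# Basic regex with the portable interval \{1,\} for "one or more".
bre_interval=$(printf '%s\n' "$pair" | sed 's/employee_id=\([0-9]\{1,\}\)/\1/')

# Extended regex: unescaped parentheses group and "+" repeats.
ere=$(printf '%s\n' "$pair" | sed -E 's/employee_id=([0-9]+)/\1/')

echo "$bre_plus"
echo "$bre_interval"
echo "$ere"
```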
You are specifically asking for sed, but in case you may use something else - any POSIX-compliant shell can do parameter expansion which doesn't require a fork/subshell:
foo='employee_id=1234'
var=${foo%%=*}
value=${foo#*=}
 
$ echo "var=${var} value=${value}"
var=employee_id value=1234

Shell scripting using grep to split a string

I have a variable in my shell script of the form
myVAR="firstWord###secondWord"
I would like to use grep or some other tool to separate into two variables such that the final result is:
myFIRST="firstWord"
mySECOND="secondWord"
How can I go about doing this? #{3} is what I want to split on.
Using substitution with sed:
echo "$myVAR" | sed -E 's/(.*)#{3}(.*)/\1/'
>>> firstWord
echo "$myVAR" | sed -E 's/(.*)#{3}(.*)/\2/'
>>> secondWord
# saving to variables
myFIRST=$(echo "$myVAR" | sed -E 's/(.*)#{3}(.*)/\1/')
mySECOND=$(echo "$myVAR" | sed -E 's/(.*)#{3}(.*)/\2/')
The best tool for this is sed:
$ echo "firstWord###secondWord" | sed 's/###/\
/'
firstWord
secondWord
A complete example:
$ read myFIRST mySECOND < <(echo "$myVAR" | sed 's/###/ /')
$ echo $myFIRST
firstWord
$ echo $mySECOND
secondWord
$ STR='firstWord###secondWord'
$ eval $(echo $STR | sed 's:^:V1=":; /###/ s::";V2=": ;s:$:":')
$ echo $V1
firstWord
$ echo $V2
secondWord
This is how I would do it with zsh:
myVAR="firstWord###secondWord"
<<<$myVAR sed 's/###/ /' | read myFIRST mySECOND
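The split can also be done without any external tool. This is a minimal sketch assuming bash or another POSIX-style shell; the sep variable is introduced here for illustration only.

```shell
#!/usr/bin/env bash
myVAR='firstWord###secondWord'
sep='###'

# Remove the longest suffix that starts with the separator:
# what remains is everything before the first separator.
myFIRST=${myVAR%%"$sep"*}

# Remove the shortest prefix that ends with the separator:
# what remains is everything after the first separator.
mySECOND=${myVAR#*"$sep"}

echo "$myFIRST"
echo "$mySECOND"
```

Unlike the read-based variants, this keeps working when either word contains spaces, because nothing is re-split on whitespace.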