Print a part of string regex bash - regex

From this content (in a file):
myspecificBhost.fqdn.com myspecificaBhost.fqdn.com myspecificzBhost.fqdn.com
I need to print the next 4 characters from the "B":
Bhost
I tried:
echo ${var:position1:length}
but the position of the "B" is not the same in every host name.

Using BASH regex:
s='myspecificBhost.fqdn.com myspecificaBhost.fqdn.com myspecificzBhost.fqdn.com'
[[ "$s" =~ (B[a-z][a-z][a-z][a-z]) ]] && echo "${BASH_REMATCH[1]}"
Bhost

Try this sed command:
sed -nr '/.*c(.{4,6}).*/s//\1/p' input.txt | cut -c2-6
RESULT:
Bhost
With grep command:
cat input.txt | grep -o B.... | head -1
RESULT:
Bhost

Try with:
cat file | grep -o B....

Bash using parameter substitution. Outputs the 4 characters
after the first 'B':
text='myspecificBhost.fqdn.com myspecificaBhost.fqdn.com myspecificzBhost.fqdn.com'
text=${text#*B}
text=${text:0:4}
echo "${text}"
Output:
host
To get the leading 'B' use
echo "B${text}"
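The same parameter substitution idea can be repeated to print every occurrence rather than only the first. This is a minimal sketch, not part of the original answer: the function name all_b_matches is made up for illustration, and it assumes "the next 4 characters" means any 4 characters, as in the grep -o B.... answers.

```shell
#!/bin/sh
# Hypothetical helper: print "B" plus the following 4 characters
# for every "B" that still has at least 4 characters after it.
all_b_matches() {
    rest=$1
    while :; do
        # Stop as soon as no "B" followed by 4 characters remains.
        case $rest in
            *B????*) ;;
            *) break ;;
        esac
        rest=${rest#*B}             # drop everything through the first "B"
        printf 'B%.4s\n' "$rest"    # print "B" and the next 4 characters
    done
}

all_b_matches 'myspecificBhost.fqdn.com myspecificaBhost.fqdn.com myspecificzBhost.fqdn.com'
```

For the sample line this prints Bhost three times, once per host name.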

Related

sed not able to print matching regex group only

I have some key value pair arguments. I need to print them as is.
Example.
echo $X
(a=b) (c=d) (e=f)
echo "$X" | sed -E 's/([a-zA-Z0-9_]*=[a-zA-Z0-9_]*)/match/1'
echo "$X" | sed -E 's/([a-zA-Z0-9_]*=[a-zA-Z0-9_]*)/\1/1'
echo "$X" | sed -E 's/([a-zA-Z0-9_]*=[a-zA-Z0-9_]*)/\1/2'
echo "$X" | sed -E 's/([a-zA-Z0-9_]*=[a-zA-Z0-9_]*)/\1/3'
With the expressions above, I wanted to print the matching groups one by one. Using .* in the pattern is greedy and prints only the first or the last matching group. How can I print any given matching group this way?
Here is my expected output.
a=b
c=d
e=f
This grep one-liner will do:
grep -o '[^(]*=[^)]*'
example:
kent$ grep -o '[^(]*=[^)]*' <<<'(a=b) (c=d) (e=f)'
a=b
c=d
e=f
Replace ) ( with a newline and remove the remaining parentheses.
echo "$X" | sed 's/) (/\n/g;s/[()]//g'
To print the $nth line, you can pipe the output to
sed -n "$n p"
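The two steps (put each match on its own line, then select one line) can be combined in a small helper. This is only a sketch: the name nth_pair is hypothetical, and it assumes the input has the "(key=value) (key=value)" shape shown in the question.

```shell
#!/bin/sh
# Hypothetical helper: print the Nth key=value pair (1-based).
#   $1 = index, $2 = input string
nth_pair() {
    printf '%s\n' "$2" |
        grep -o '[^(]*=[^)]*' |   # one pair per line
        sed -n "${1}p"            # keep only line N
}

X='(a=b) (c=d) (e=f)'
nth_pair 2 "$X"    # prints c=d
```

An index past the last pair prints nothing.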

Pulling Single digit out of string --bash

Example
./test.sh R19
echo "$1" > test.txt
cat test.txt | grep -o ^[A-Z] > model.txt
cat test.txt | grep -o [0-9] > num1.txt
cat test.txt | grep -o [0-9]$ > num2.txt
echo "$(cat model.txt)00$(cat num1.txt)00$(cat num2.txt)"
I'm expecting to see R001009, but what I get is
R001
9009
So how can I make sure num1.txt only receives the middle digit and not both?
That's because grep -o '[0-9]' is returning all the digits on separate lines.
The painful way would be cat test.txt | grep -o [0-9] | head -1 > num1.txt
But don't do that: you're doing way too much file I/O. Use a regex in bash:
if [[ $1 =~ ^([A-Z])([0-9])([0-9])$ ]]; then
    printf "%s00%d00%d\n" "${BASH_REMATCH[@]:1}"
fi
Make sure you're using #!/bin/bash as your shebang line.
$ set -- R19
$ if [[ $1 =~ ^([A-Z])([0-9])([0-9])$ ]]; then
>     printf "%s00%d00%d\n" "${BASH_REMATCH[@]:1}"
> fi
R001009

bash - Extract part of string

I have a string something like this
xsd:import schemaLocation="AppointmentManagementService.xsd6.xsd" namespace=
I want to extract the following from it :
AppointmentManagementService.xsd6.xsd
I have tried using regex, bash and sed with no success. Can someone please help me out with this?
The regex that I used was this :
/AppointmentManagementService.xsd\d{1,2}.xsd/g
Your string is:
nampt@nampt-desktop:$ cat 1
xsd:import schemaLocation="AppointmentManagementService.xsd6.xsd" namespace=
Try with awk:
cat 1 | awk -F "\"" '{print $2}'
Output:
AppointmentManagementService.xsd6.xsd
sed doesn't recognize \d, use [0-9] or [[:digit:]] instead:
sed 's/^.*schemaLocation="\([^"]\+[[:digit:]]\{1,2\}\.xsd\)".*$/\1/g'
## or
sed 's/^.*schemaLocation="\([^"]\+[0-9]\{1,2\}\.xsd\)".*$/\1/g'
You can use bash native regex matching:
$ in='xsd:import schemaLocation="AppointmentManagementService.xsd6.xsd" namespace='
$ if [[ $in =~ \"(.+)\" ]]; then echo "${BASH_REMATCH[1]}"; fi
Output:
AppointmentManagementService.xsd6.xsd
Based on your example, if you want to require at least 1 and at most 2 digits in the .xsd... component, you can fine-tune the regex with:
$ if [[ $in =~ \"(AppointmentManagementService.xsd[0-9]{1,2}.xsd)\" ]]; then echo "${BASH_REMATCH[1]}"; fi
Using PCRE in GNU grep:
grep -oP 'schemaLocation="\K.*?(?=")'
This prints the text between schemaLocation=" and the very next occurrence of ".
Reference:
https://unix.stackexchange.com/a/13472/109046
We can also use the cut command for this purpose:
[root@code]# echo "xsd:import schemaLocation=\"AppointmentManagementService.xsd6.xsd\" namespace=" | cut -d\" -f 2
AppointmentManagementService.xsd6.xsd
s='xsd:import schemaLocation="AppointmentManagementService.xsd6.xsd" namespace='
echo $s | sed 's/.*schemaLocation="\(.*\)" namespace=.*/\1/'
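The same extraction can also be done with shell parameter expansion alone, without starting sed, awk or cut. A minimal sketch, assuming the value always follows schemaLocation=" and contains no embedded double quote; the function name schema_location is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the quoted value that follows schemaLocation=
schema_location() {
    value=${1#*schemaLocation=\"}   # drop everything through schemaLocation="
    value=${value%%\"*}             # drop from the next double quote to the end
    printf '%s\n' "$value"
}

s='xsd:import schemaLocation="AppointmentManagementService.xsd6.xsd" namespace='
schema_location "$s"    # prints AppointmentManagementService.xsd6.xsd
```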

How to extract a number from a string using grep and regex

I cat a file and pipe it through grep with a regular expression like this:
cat /tmp/tmp_file | grep "toto.titi\[[0-9]\+\].tata=55"
The command displays the following output:
toto.titi[12].tata=55
Is it possible to modify my grep command so that it outputs only the number 12?
You can grab this in pure BASH using its regex capabilities:
s='toto.titi[12].tata=55'
[[ "$s" =~ ^toto.titi\[([0-9]+)\]\.tata=[0-9]+$ ]] && echo "${BASH_REMATCH[1]}"
12
You can also use sed:
sed 's/toto.titi\[\([0-9]*\)\].tata=55/\1/' <<< "$s"
12
OR using awk:
awk -F '[\\[\\]]' '{print $2}' <<<"$s"
12
Use a lookbehind with grep -P:
echo 'toto.titi[12].tata=55' | grep -oP '(?<=\[)\d+'
12
Without Perl regex, use sed to strip the "[":
echo 'toto.titi[12].tata=55' | grep -o "\[[0-9]\+" | sed 's/\[//g'
12
Pipe it to sed and use a back-reference (sed has no \d, and in basic regex the group must be written \( \)):
cat /tmp/tmp_file | grep "toto.titi\[[0-9]\+\].tata=55" | sed 's/.*\[\([0-9]*\)\].*/\1/'
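The grep and sed stages can also be merged into one sed call: -n suppresses the default output and the p flag prints only the lines where the substitution succeeded. This is a sketch under the assumption that each wanted line has exactly the form toto.titi[N].tata=55; the function name bracket_index is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read lines on stdin and print the number inside
# the brackets for every line of the form toto.titi[N].tata=55.
bracket_index() {
    sed -n 's/^toto\.titi\[\([0-9][0-9]*\)]\.tata=55$/\1/p'
}

printf '%s\n' 'toto.titi[12].tata=55' 'unrelated line' | bracket_index    # prints 12
```

To read the file from the question, run bracket_index < /tmp/tmp_file.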

Can not extract the capture group with either sed or grep

I want to extract the value from a key=value pair, but I cannot.
Example I tried:
echo employee_id=1234 | sed 's/employee_id=\([0-9]+\)/\1/g'
But this gives employee_id=1234 and not 1234 which is actually the capture group.
What am I doing wrong here? I also tried:
echo employee_id=1234| egrep -o employee_id=([0-9]+)
but no success.
1. Using grep -Eo (egrep is deprecated):
echo 'employee_id=1234' | grep -Eo '[0-9]+'
1234
2. using grep -oP (PCRE):
echo 'employee_id=1234' | grep -oP 'employee_id=\K([0-9]+)'
1234
3. Using sed:
echo 'employee_id=1234' | sed 's/^.*employee_id=\([0-9][0-9]*\).*$/\1/'
1234
To expand on anubhava's answer number 2, the general pattern to have grep return only the capture group is:
$ regex="$precedes_regex\K($capture_regex)(?=$follows_regex)"
$ echo $some_string | grep -oP "$regex"
so
# matches and returns b
$ echo "abc" | grep -oP "a\K(b)(?=c)"
b
# no match
$ echo "abc" | grep -oP "z\K(b)(?=c)"
# no match
$ echo "abc" | grep -oP "a\K(b)(?=d)"
Using awk
echo 'employee_id=1234' | awk -F= '{print $2}'
1234
Use sed -E for extended regular expressions:
echo employee_id=1234 | sed -E 's/employee_id=([0-9]+)/\1/g'
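The attempt in the question fails because sed uses basic regular expressions by default, where a bare + is a literal plus sign, so the pattern never matches and the line is printed unchanged. The sketch below shows two spellings that do match; the variable names are illustrative only.

```shell
#!/bin/sh
input='employee_id=1234'

# Basic regex: groups are \( \) and "one or more" is the interval \{1,\}.
bre=$(printf '%s\n' "$input" | sed 's/employee_id=\([0-9]\{1,\}\)/\1/')

# Extended regex (-E): groups are ( ) and + is a quantifier.
ere=$(printf '%s\n' "$input" | sed -E 's/employee_id=([0-9]+)/\1/')

printf '%s\n' "$bre" "$ere"    # prints 1234 twice
```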
You are specifically asking for sed, but in case you may use something else - any POSIX-compliant shell can do parameter expansion which doesn't require a fork/subshell:
foo='employee_id=1234'
var=${foo%%=*}
value=${foo#*=}
 
$ echo "var=${var} value=${value}"
var=employee_id value=1234