Access the word in the file with grep - regex

I have a conf file and I use grep to read data from it, but this method is not very useful for me.
How can I get just the value that follows the search term?
I am using:
grep "export:" /etc/VDdatas.conf
Print:
export: HelloWorld
I want: (without "export: ")
HelloWorld
How can I do that?

If you're using GNU grep you can use PCRE and a lookbehind:
grep -P -o '(?<=export: ).*' /etc/VDdatas.conf
The -o option means to print only the part of the line that matches the regexp, and using a lookbehind for the "export: " prefix (including the trailing space) keeps it out of the match.
You can also use sed or awk:
sed -n 's/^export: //p' /etc/VDdatas.conf
awk '/export:/ {print $2}' /etc/VDdatas.conf

I suggest you pipe the match to awk.
grep "export:" /etc/VDdatas.conf | awk -F ' ' '{print $2}'
This will print the second word in the output (after splitting the line on spaces).
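If you need this for more than one key, the same idea can be wrapped in a small function. This is only a sketch: the name get_conf_value and the sample file are made up for illustration, and it assumes the key is a plain word with no regex metacharacters.

```shell
# Hypothetical helper: print whatever follows "KEY:" (and any spaces) in a conf file.
# Assumes KEY is a plain word with no regex metacharacters.
get_conf_value() {
    sed -n "s/^$1:[[:space:]]*//p" "$2"
}

# Throwaway sample file standing in for /etc/VDdatas.conf.
conf=$(mktemp)
printf '%s\n' 'import: Other' 'export: HelloWorld' > "$conf"

value=$(get_conf_value export "$conf")
printf '%s\n' "$value"    # prints HelloWorld

rm -f "$conf"
```

Unlike print $2 in awk, this keeps the whole value even when it contains spaces.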

Related

grep to match filename in full path

I want to "extract" the file name in a full path string using grep.
For example:
In /etc/network/interfaces I want to match "interfaces".
In /home/user/Documents/report.pdf I want to match "report.pdf".
Basically I want the opposite of:
$ ls /etc/network/interfaces | grep "^.*/"
I tried:
$ ls -p /etc/network/interfaces | grep "/.*$"
But that does not start at the last slash: it matches from the first slash (/), through all characters (.*), to the end ($). Since slashes are characters as well, it matches the whole path.
Does anyone know a way to match only the last part? Something like "from the last slash until the end".
Thank you.
awk, getting the / separated last field:
% awk -F/ '{print $NF}' <<<'/etc/network/interfaces'
interfaces
% awk -F/ '{print $NF}' <<<'/home/user/Documents/report.pdf'
report.pdf
grep, getting the portion after last /:
% grep -o '[^/]\+$' <<<'/etc/network/interfaces'
interfaces
% grep -o '[^/]\+$' <<<'/home/user/Documents/report.pdf'
report.pdf
sed, replacing everything up to the last / with nothing:
% sed 's_.*/__' <<<'/etc/network/interfaces'
interfaces
% sed 's_.*/__' <<<'/home/user/Documents/report.pdf'
report.pdf
How about simply matching on not /? Also, for extraction, you need the -o flag to grep.
ls -p /etc/network/interfaces | grep -o '[^/]*$'
As said in the first comment, you can use awk directly, with something like:
ls -l /etc/networks | awk '{print $NF}' | awk -F "/" '{print $NF}'
It should be possible to merge these two awk commands into one as well.
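If the path is already in a shell variable, you may not need grep, sed or awk at all. A minimal sketch using POSIX parameter expansion and basename (the variable names are just for illustration):

```shell
path=/home/user/Documents/report.pdf

# Strip the longest prefix matching "*/", leaving only the last component.
name=${path##*/}
printf '%s\n' "$name"                 # prints report.pdf

# basename gives the same result for ordinary paths.
basename /etc/network/interfaces      # prints interfaces
```

The two differ when the path ends in a slash: the expansion gives an empty string, while basename still returns the last directory name.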

Regex to match an IP address between a colon and a slash with grep

The lines in the file I want to search look like this:
log:192.1.1.128/50098
log:192.1.1.11/22
...
Now I tried the following RegEx but none of them worked:
grep -oE "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b" file
grep -oE "\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b"
grep -oE "\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
You can do this without regex using awk (on this simple example):
awk -F":|/" '{print $2}' file
192.1.1.128
192.1.1.11
To check that the IP has four dot-separated parts (that is, three dots):
awk -F":|/" '{n=split($2,a,".");if (n==4) print $2}' file
192.1.1.128
192.1.1.11
You could use grep also.
$ grep -oP '.*?:\K[^/]*(?=/)' file
192.1.1.128
192.1.1.11
grep's extended regexp option -E doesn't support \d; you need to use [0-9] instead.
$ grep -oE "\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\b" file
192.1.1.128
192.1.1.11
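Note that [0-9]{1,3} also accepts things like 999. If you care about that, here is a rough sketch that builds on the awk split idea and checks each octet is a number no larger than 255. The function name extract_ips is made up, and it assumes the log:IP/port layout from the question.

```shell
# Hypothetical helper: print the field between ":" and "/" only when it
# looks like a dotted quad with every octet in the range 0-255.
extract_ips() {
    awk -F'[:/]' '{
        n = split($2, octet, ".")
        ok = (n == 4)
        for (i = 1; ok && i <= n; i++)
            if (octet[i] !~ /^[0-9]+$/ || octet[i] + 0 > 255) ok = 0
        if (ok) print $2
    }' "$@"
}

# The last two sample lines are invalid on purpose and are skipped.
printf '%s\n' 'log:192.1.1.128/50098' 'log:192.1.1.11/22' \
              'log:999.1.1.1/80' 'log:1.2.3/80' | extract_ips
```

With no file argument the function reads standard input; otherwise pass the file name as in the examples above.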

grep extract simple url - without scheme

I need to extract a URL from a file. I've started with:
grep -E -o 'ftp://\S*' $filename
I know that this particular URL will start with the ftp scheme and will end with a whitespace character (space or newline).
I receive something like:
ftp://dir/some_file.ext
But I need just the path (/dir/some_file.ext), without the scheme (the ftp:// part).
Can I do it with the first regexp? Do I have to use a second one?
I cannot use anything other than grep/egrep.
If your grep supports -P (PCRE flag) then you can use:
grep -oP 'ftp:/\K/\S*' $filename
/dir/some_file.ext
If for some reason you don't have grep -P available, then pipe into another grep:
grep -oE 'ftp://\S*' file | grep -oE '/[^/].*'
/dir/some_file.ext
This GNU awk command (GNU awk is needed because the record separator has multiple characters) may also work:
awk -v RS="ftp:/" 'NR>1 {print $1}' file
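Since the question is limited to grep, here is the two-grep pipeline tried on a sample line. It is a sketch with two assumptions of mine: the function name extract_ftp_path is made up, and \S is replaced with [^[:space:]] because \S is a GNU extension in -E mode.

```shell
# Hypothetical helper: the first grep isolates the ftp URL, the second keeps
# everything from the first single slash onward.
extract_ftp_path() {
    grep -oE 'ftp://[^[:space:]]*' "$@" | grep -oE '/[^/].*'
}

printf '%s\n' 'fetch ftp://dir/some_file.ext now' | extract_ftp_path
# prints /dir/some_file.ext
```

The second pattern works because it needs a slash followed by a non-slash, so it skips the first slash of "//" and starts at the second one.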

How to remove/strip double or single quote from a string?

I have a file with some lines like these:
ENVIRONMENT="myenv"
ENV_DOMAIN='mydomain.net'
LOGIN_KEY=mykey.pem
I want to extract the parts after the = but without the surrounding quotes. I tried with gsub like this:
awk -F= '!/^(#|$)/ && /^ENVIRONMENT=/ {gsub(/"|'/, "", $2); print $2}'
Which ends up with -bash: syntax error near unexpected token ')' error. It works just fine for single matching: /"/ or /'/ but doesn't work when I try match either one. What am I doing wrong?
If you are just trying to remove the punctuation then you can do it as below....
# remove all punctuation
awk -F= '{print $2}' n.dat | tr -d '[:punct:]'
# only remove single and double quotes
awk -F= '{print $2}' n.dat | tr -d \'\"
explanation:
tr -d \'\" deletes any single and double quotes.
tr -d '[:punct:]' deletes every character in the punctuation class (including the dots).
Sample output from the 2nd command above (without quotes):
myenv
mydomain.net
mykey.pem
The problem is not with awk, but with bash. The single quote inside the gsub is closing the open quote so that bash is trying to parse the command awk with arguments !/^...gsub(/"|/,, ,, $2 and then an unmatched close paren. Try replacing the single quote with '"'"' (so that bash will properly terminate the string, then apply a single quote, then reopen another string.)
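Here is what that replacement looks like when applied to the command from the question. It is a sketch only: the wrapper function strip_quotes is a name I made up, and the ENVIRONMENT filter is dropped so all three sample lines are printed.

```shell
# '"'"' closes the single-quoted string, adds a double-quoted single quote,
# then reopens the single-quoted string. awk itself sees: gsub(/"|'/, "", $2)
strip_quotes() {
    awk -F= '!/^(#|$)/ {gsub(/"|'"'"'/, "", $2); print $2}' "$@"
}

printf '%s\n' 'ENVIRONMENT="myenv"' "ENV_DOMAIN='mydomain.net'" \
              'LOGIN_KEY=mykey.pem' | strip_quotes
```

Because the line is split on =, a value that itself contains = would be cut short at the second =.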
Is awk really a requirement? If not, why don't you use a simple sed command:
sed -rn -e "s/^[^#]+='(.*)'$/\1/p" \
-e "s/^[^#]+=\"(.*)\"$/\1/p" \
-e "s/^[^#]+=(.*)/\1/p" data
This might seem over-engineered, but it works properly with embedded quotes:
sh$ cat data
ENVIRONMENT="myenv"
ENV_DOMAIN='mydomain.net'
LOGIN_KEY=mykey.pem
PASSWD="good ol'passwd"
sh$ sed -rn -e "s/^[^#]+='(.*)'/\1/p" -e "s/^[^#]+=\"(.*)\"/\1/p" -e "s/^[^#]+=(.*)/\1/p" data
myenv
mydomain.net
mykey.pem
good ol'passwd
You can use awk like this:
awk -F "=['\"]?|['\"]" '{print $2}' file
myenv
mydomain.net
mykey.pem
This will work with your awk
awk -F= '!/^(#|$)/ && /^ENVIRONMENT=/ {gsub(/"/,"",$2);gsub(q,"",$2); print $2}' q=\' file
It is the single quote in the expression that creates the problem. Put it in a variable and it will work.
I did the following:
awk -F"=\"|='|'|\"|=" '{print $2}' file
myenv
mydomain.net
mykey.pem
This tells awk to use any of =", =', ', " or = as the field separator.
This is because the awk program must be enclosed in single quotes when run as a command line program. The program can be tripped up if a single quote is contained inside the script. Special tricks can be made to use single quotes as strings inside the program. See Shell-Quoting Issues in the GNU Awk Manual.
One trick is to save the match string as a variable:
awk -F\= -v s="'|\"" '{gsub(s, "", $2); print $2}' file
Output:
myenv
mydomain.net
mykey.pem
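Another way to avoid the shell quoting problem is to never type a literal single quote in the awk program. In an awk string, \047 is the octal escape for the single quote, so a sketch could look like this (the function name strip_quotes_octal is made up):

```shell
# The regex is given as an awk string; "[\"\047]" becomes the bracket
# expression matching a double quote or a single quote.
strip_quotes_octal() {
    awk -F= '!/^(#|$)/ {gsub("[\"\047]", "", $2); print $2}' "$@"
}

printf '%s\n' 'ENVIRONMENT="myenv"' "ENV_DOMAIN='mydomain.net'" \
              'LOGIN_KEY=mykey.pem' | strip_quotes_octal
```

This keeps the whole program inside one pair of single quotes, which some people find easier to read than splicing quotes together.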

Remove everything after 2nd occurrence in a string in unix

I would like to remove everything after the 2nd occurrence of a particular
pattern in a string. What is the best way to do it in Unix? What is the most elegant and simple method to achieve this: sed, awk, or just Unix commands like cut?
My input would be
After-u-math-how-however
Output should be
After-u
Everything after the 2nd - should be stripped out. The regex should also match
zero occurrences of the pattern, so zero or one occurrence should be ignored and
from the 2nd occurrence everything should be removed.
So if the input is as follows
After
Output should be
After
Something like this would do it.
echo "After-u-math-how-however" | cut -f1,2 -d'-'
This will split up (cut) the string into fields, using a dash (-) as the delimiter. Once the string has been split into fields, cut will print the 1st and 2nd fields.
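The question also asks about input with zero or one dash. cut passes lines that contain no delimiter through unchanged (unless -s is given), so those cases come out as required. A quick check on three sample inputs:

```shell
# Three cases: several dashes, exactly one dash, no dash at all.
result=$(printf '%s\n' 'After-u-math-how-however' 'After-u' 'After' | cut -d- -f1,2)
printf '%s\n' "$result"
# prints:
# After-u
# After-u
# After
```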
This might work for you (GNU sed):
sed 's/-[^-]*//2g' file
You could use the following regex to select what you want:
^[^-]*-\?[^-]*
For example:
echo "After-u-math-how-however" | grep -o "^[^-]*-\?[^-]*"
Results:
After-u
@EvanPurkisher's cut -f1,2 -d'-' solution is IMHO the best one, but since you asked about sed and awk:
With GNU sed for -r:
$ echo "After-u-math-how-however" | sed -r 's/([^-]+-[^-]*).*/\1/'
After-u
With GNU awk for gensub():
$ echo "After-u-math-how-however" | awk '{$0=gensub(/([^-]+-[^-]*).*/,"\\1","")}1'
After-u
Can be done with non-GNU sed using \( and *, and with non-GNU awk using match() and substr() if necessary.
awk -F - '{print $1 (NF>1? FS $2 : "")}' <<<'After-u-math-how-however'
Split the line into fields based on field separator - (option spec. -F -) - accessible as special variable FS inside the awk program.
Always print the 1st field (print $1), followed by:
If there's more than 1 field (NF>1), append FS (i.e., -) and the 2nd field ($2)
Otherwise: append "", i.e.: effectively only print the 1st field (which in itself may be empty, if the input is empty).
This can be done in pure bash (which means no fork, no external process). Read into an array split on '-', then slice the array:
$ IFS=-
$ read -ra val <<< After-u-math-how-however
$ echo "${val[*]}"
After-u-math-how-however
$ echo "${val[*]:0:2}"
After-u
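If arrays are not available (plain sh), roughly the same thing can be done with parameter expansion only. This is a sketch under my own assumptions: the function name keep_two_fields is made up and the delimiter is hard-coded to a dash.

```shell
# Hypothetical helper: keep at most the first two dash-separated fields.
keep_two_fields() {
    s=$1
    case $s in
        *-*)
            first=${s%%-*}     # everything before the first dash
            rest=${s#*-}       # everything after the first dash
            printf '%s-%s\n' "$first" "${rest%%-*}"
            ;;
        *)
            printf '%s\n' "$s" # no dash: leave the input alone
            ;;
    esac
}

keep_two_fields After-u-math-how-however   # prints After-u
keep_two_fields After                      # prints After
```

It also leaves IFS untouched, so nothing has to be restored afterwards.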
awk '$0 = $2 ? $1 FS $2 : $1' FS=-
Result
After-u
After
This will do it in awk (the i<=NF test avoids printing a trailing - when there is no 2nd field):
echo "After" | awk -F "-" '{printf "%s",$1; for (i=2; i<=2 && i<=NF; i++) printf "-%s",$i; print ""}'