Parse the output of my bash script and save as CSV - regex

I have a bash script that SSH'es to a list of servers (given a .txt file), runs another script inside each server, and shows the results. But I need to parse the verbose data from output, and eventually save some meaningful results as a .CSV file.
Here is my main script:
set +e
while read line
do
ssh myUser@"$line" -t 'sudo su /path/to/script.sh' < /dev/null
done < "/home/listOfServers.txt"
where the listOfServers.txt is like
server1
server2
server3
The output of running my script looks like this, showing the results for each server one after another.
SNAME:WORKFLOW_APS_001 |10891 | Alive:2018-06-18:06:54 |TCP
SNAME:WORKFLOW_APSWEB_001 |11343 | Alive:2018-06-18:06:54 |TCP
Processes in Instance: WORKFLOW_OHS_002
WORKFLOW_OHS_002 | OHS | 8925 | Alive | 852960621 | 1367120 | 510:11:51 | http:9881
Processes in Instance: WORKFLOW_OHS_003
WORKFLOW_OHS_003 | OHS | 9187 | Alive | 2041606684 | 1367120 | 510:11:51 | http:9883
SNAME:WORKFLOW_RPSF_001 |10431 | Alive:2018-06-18:06:55 |TCP
SNAME:WORKFLOW_SCPTL_001 |9788 | Alive:2018-06-18:06:55 |TCP
...
From this output, I only need the OHS names and their status, and save along with the original server's name as a CSV. The pattern to me looks like this: I need to look at each line, and if the line doesn't contain "Processes in Instance" or "SNAME", then split based on space, and grab the 1st (OHS name) and 4th field (status). So my CSV will look like:
server1, WORKFLOW_OHS_002, Alive
server1, WORKFLOW_OHS_003, Alive
server2, .....
...
How can I modify my bash to do this?

You can use awk:
while read -r line; do
ssh myUser@"$line" -t 'sudo su /path/to/script.sh' < /dev/null |
awk -v s="$line" -F '|' -v OFS=', ' '!/^[[:blank:]]*SNAME:/ && NF>2 {
gsub(/^[[:blank:]]+|[[:blank:]]+$/, "");
gsub(/[[:blank:]]*\|[[:blank:]]*/, "|");
print s, $1, $4
}'
done < "/home/listOfServers.txt"
EDIT: As per your comment below, you can do this to handle error conditions:
while read -r line; do
out=$(ssh myUser@"$line" -t 'sudo su /path/to/script.sh' < /dev/null 2>&1)
if [[ -z $out ]]; then
echo "$line, NULL, NULL"
elif [[ $out == *"timed out"* ]]; then
echo "$line, FAIL, FAIL"
else
awk -v s="$line" -F '|' -v OFS=', ' '!/^[[:blank:]]*SNAME:/ && NF>2 {
gsub(/^[[:blank:]]+|[[:blank:]]+$/, "");
gsub(/[[:blank:]]*\|[[:blank:]]*/, "|");
print s, $1, $4
}' <<< "$out"
fi
done < "/home/listOfServers.txt"
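The loops above print the rows to the terminal, while the question asks for a CSV file. Here is a rough sketch of the parsing step as a standalone function, so it can be tried on saved output without SSH access. The name ohs_rows is made up for illustration, and it assumes the OHS rows are the only pipe-separated lines left once the SNAME rows and the "Processes in Instance" headings are dropped.

```bash
# Sketch: turn one server's raw output (on stdin) into "server, name, status" rows.
# ohs_rows is a hypothetical helper name; $1 is the server name to prefix.
ohs_rows() {
    awk -v server="$1" -F '[ \t]*[|][ \t]*' '
        # Skip the SNAME rows and the "Processes in Instance" headings.
        /SNAME:/ || /Processes in Instance/ { next }
        # Remaining pipe-separated rows: name is field 1, status is field 4.
        NF >= 4 {
            sub(/^[ \t]+/, "", $1)
            print server ", " $1 ", " $4
        }
    '
}
```

Inside the while loop you would pipe the ssh command into ohs_rows "$line", and redirect the whole loop by writing the output file after the input redirection on the final done line (for example > /home/results.csv, a path picked only for this example).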

Something to try - and good luck.
while read line
do ssh myUser@"$line" -t 'sudo su /path/to/script.sh' < /dev/null |
sed -E "/Processes in Instance/d; /SNAME/d;
s/^ *([^| ]*) *[|][^|]*[|][^|]*[|] *([^| ]*).*/$line,\1,\2/;"
done < "/home/listOfServers.txt"
You ought to be able to improve on that. :)

Related

What gcloud command would list all the compute instances (for all gcp projects) with a network tag that contains a specific string?

I have put together the code below to find all resources with a network tag that contains -allowaccess however it doesn't seem to work...
for i in $(gcloud projects list | awk NR>1); do gcloud compute instances list --filter="tags.items:-allowaccess --project=$i; done
Any ideas?
A colleague of mine figured it out...here's the command - hope it's useful to others!
for i in $(gcloud projects list | awk '{print $1}' | awk 'NR>1'); do echo PROJECT: $i && echo "--" && gcloud compute instances list --project=$i --filter="(tags.items:allowaccess)" && echo ""; done
For each project, this outputs each VM with a network tag that contains the text 'allowaccess'.
Try something like --filter="label:(*allowaccess)" or --filter="labels.*allowaccess:*", because these are generally instance labels. See gcloud topic filters.
I think the code is self-explanatory :)
# CSV header row
echo "project;machine;region;family;value1;value2;value3;value4;value5" >> export.csv
# loop projects
for p in $(gcloud projects list | awk '{print $1}' | awk 'NR>1')
do
# loop values of instance
for i in $(gcloud compute instances list --project=${p} | grep -v "TERMINATED" | grep -v "NAME")
do
if [ "${i}" == "RUNNING" ]
then
echo # end the current row (the original echoed an unset variable, which only printed a newline)
X=0
elif [[ $X -eq 0 ]]
then
echo -n ${i}
echo -n ";"
echo -n ${i}
echo -n ";"
X=$((X+1))
else
echo -n ${i}
echo -n ";"
X=$((X+1))
fi
done
done >> export.csv
# remove wrong ;
sed -i 's/,;/ /g' export.csv
sed -i 's/;vCPU/ vCPU/g' export.csv
sed -i 's/;GiB/ GiB/g' export.csv

How to use sed to replace strings in piped input with command outputs using regular expressions

I'm trying to use sed to properly parse the output of auditd records. These have encoded hex, long timestamps and UID/AUID which I need to decode/translate via commands.
I am using pipes as I have to ship this across to the system journal.
I have gotten this far:
sed -r "s#msg=([A-Z0-9]*)\$#msg=$(xxd -r -p <<< \1)#"
Sample input:
[IRRELEVANT STUFF] msg=7468697320697320612073616D706C652074657374
The problem is that sed is not "unfolding" the capture group and the nested command receives the wrong argument/input.
xxd should be receiving 7468697320697320612073616D706C652074657374 and not \1
I have tested this in isolated fashion and that is indeed what is happening.
HELP! Thanks!!
sed is for doing s/old/new, that is all, for anything else just use awk. With GNU awk for the 3rd arg to match():
awk '
match($0,/(.*msg=)([[:alnum:]]+)$/,a) {
cmd = "xxd -r -p <<< " a[2]
$0 = a[1] ((cmd | getline line) > 0 ? line : "ERROR")
close(cmd)
}
{ print }
'
The above is assuming the syntax for calling your xxd command is exactly what you had in your sed script. I don't have xxd on my system but here's using wc -c instead to show how the script works:
$ wc -c <<< 7468697320697320612073616D706C652074657374
43
$ awk '
match($0,/(.*msg=)([[:alnum:]]+)$/,a) {
cmd = "wc -c <<< " a[2]
$0 = a[1] ((cmd | getline line) > 0 ? line : "ERROR")
close(cmd)
}
{ print }
' file
[IRRELEVANT STUFF] msg=43
I think you need to do something like this:
sed -r "s/.*msg=([A-Z0-9]*).*/xxd -r -p <<< \1/e" input
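If xxd is not available, or you would rather not spawn a command per record, the decoding can also be done in the shell itself. This is only a sketch under a few assumptions: the hex value is the last thing on the line (as in the sample input), it never encodes a NUL byte, and the function names hex_decode and decode_msg_lines are invented for this example.

```bash
# Sketch: decode a string of hex digit pairs into the bytes they encode.
# Returns non-zero (and prints nothing) for odd-length or non-hex input.
hex_decode() {
    hex=$1
    case $hex in *[!0-9A-Fa-f]*) return 1 ;; esac
    [ $(( ${#hex} % 2 )) -eq 0 ] || return 1
    while [ -n "$hex" ]; do
        rest=${hex#??}            # everything after the first two digits
        byte=${hex%"$rest"}       # the first two digits
        # Convert the pair to octal, then let printf %b emit that byte.
        printf '%b' "\\0$(printf '%03o' "0x$byte")"
        hex=$rest
    done
}

# Read records on stdin; replace a trailing msg=<hex> with its decoded text.
# Lines whose msg= value is not valid hex are passed through unchanged.
decode_msg_lines() {
    while IFS= read -r line; do
        case $line in
            *msg=*)
                value=${line##*msg=}
                if decoded=$(hex_decode "$value") && [ -n "$decoded" ]; then
                    line="${line%"$value"}$decoded"
                fi ;;
        esac
        printf '%s\n' "$line"
    done
}
```

It reads from a pipe like the sed and awk versions, so it can sit between the audit source and whatever ships the records to the journal.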

Regex w/grep against tnsnames.ora

I am trying to print out the contents of a TNS entry from the tnsnames.ora file to make sure it is correct from an Oracle RAC environment.
So if I do something like:
grep -A 4 "mydb.mydomain.com" $ORACLE_HOME/network/admin/tnsnames.ora
I will get back:
mydb.mydomain.com =
(DESCRIPTION =
(ADDRESS =
(PROTOCOL = TCP)(HOST = myhost.mydomain.com)(PORT = 1521))
  (CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME=mydb)))
Which is what I want. Now I have an environment variable being set for the JDBC connection string by an external program when the shell script gets called like:
export DB_URL=#myhost.mydomain.com:1521/mydb
So I need to get TNS alias mydb.mydomain.com out of the above string. I'm not sure how to do multiple matches and reorder the matches with regex and need some help.
grep #.+: $DB_URL
I assume will get the
#myhost.mydomain.com:
but I'm looking for
mydb.mydomain.com
So I'm stuck at this part. How do I get the TNS alias and then pipe/combine it with the initial grep to display the text for the TNS entry?
Thanks
update:
@mklement0 @Walter A - I tried your ways but they are not exactly what I was looking for.
echo "#myhost.mydomain.com:1521/mydb" | grep -Po "#\K[^:]*"
echo "#myhost.mydomain.com:1521/mydb" | sed 's/.*#\(.*\):.*/\1/'
echo "#myhost.mydomain.com:1521/mydb" | cut -d"#" -f2 | cut -d":" -f1
echo "#myhost.mydomain.com:1521/mydb" | tr "#:" "\t" | cut -f2
echo "#myhost.mydomain.com:1521/mydb" | awk -F'[#:]' '{ print $2 }'
All these methods get me back: myhost.mydomain.com
What I am looking for is actually: mydb.mydomain.com
Note:
- For brevity, the commands below use bash/ksh/zsh here-string syntax to send strings to stdin (<<<"$var"). If your shell doesn't support this, use printf %s "$var" | ... instead.
The following awk command will extract the desired string (mydb.mydomain.com) from $DB_URL (#myhost.mydomain.com:1521/mydb):
awk -F '[#:/]' '{ sub("^[^.]+", "", $2); print $4 $2 }' <<<"$DB_URL"
-F'[#:/]' tells awk to split the input into fields by either # or : or /. With your input, this means that the parts of interest are in the second field ($2) and the fourth field ($4). The sub() call removes the first .-separated component from $2, and the print call pieces together the result.
To put it all together:
domain=$(awk -F '[#:/]' '{ sub("^[^.]+", "", $2); print $4 $2 }' <<<"$DB_URL")
grep -F -A 4 "$domain" "$ORACLE_HOME/network/admin/tnsnames.ora"
You don't strictly need intermediate variable $domain, but I've added it for clarity.
Note how -F was added to grep to specify that the search term should be treated as a literal, so that characters such as . aren't treated as regex metacharacters.
Alternatively, for more robust matching, use a regex that is anchored to the start of the line with ^, and \-escape the . chars (using shell parameter expansion) to ensure their treatment as literals:
grep -A 4 "^${domain//./\.}" "$ORACLE_HOME/network/admin/tnsnames.ora"
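The same alias can be derived without awk, using only shell parameter expansion. This is a sketch that assumes the value always has the host:port/service shape shown in the question and that the host has at least one dot; tns_alias_from_url is a name made up for the example.

```bash
# Sketch: build "<service>.<domain>" from a value shaped like
# <prefix char>host.domain:port/service.
tns_alias_from_url() {
    url=$1
    service=${url##*/}    # text after the last "/"          -> mydb
    host=${url%%:*}       # text before the first ":"        -> (prefix)myhost.mydomain.com
    domain=${host#*.}     # drop the first dot-separated part -> mydomain.com
    printf '%s.%s\n' "$service" "$domain"
}
```

Because the first dot-separated label is dropped together with whatever precedes it, the leading character in front of the host name does not matter. The result can be passed to grep in the same way as $domain above.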
You can get a part of a string with
# Only GNU-grep
echo "#myhost.mydomain.com:1521/mydb" | grep -Po "#\K[^:]*"
# or
echo "#myhost.mydomain.com:1521/mydb" | sed 's/.*#\(.*\):.*/\1/'
# or
echo "#myhost.mydomain.com:1521/mydb" | cut -d"#" -f2 | cut -d":" -f1
# or, when the string already is in a var
echo "${DB_URL#*#}" | cut -d":" -f1
# or using a temp var
tmpvar="${DB_URL#*#}"
echo "${tmpvar%:*}"
I had skipped the awk alternative, which @mklement0 had already given:
echo "#myhost.mydomain.com:1521/mydb" | awk -F'[#:]' '{ print $2 }'
The awk solution is straightforward. If you want the same approach without awk, you can do something like
echo "#myhost.mydomain.com:1521/mydb" | tr "#:" "\t" | cut -f2
or the ugly
echo "#myhost.mydomain.com:1521/mydb" | (IFS='#:' read -r _ url _; echo "$url")
What is happening here?
After introducing the new IFS I want to take the second word of the input. The first and third word(s) are caught in the dummy variable _ (you could have named them dummyvar1 and dummyvar2). The pipe | creates a subprocess, so you need ( ) to keep reading and displaying the variable url in the same process.
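If the subshell gets in the way, one option is to feed read from a here-document instead of a pipe, so the variable is set in the current shell. A small sketch, with extract_host as an invented name and both @ and # accepted as the leading delimiter:

```bash
# Sketch: split on the delimiters and keep the second field, without a pipe.
# The IFS assignment applies only to this one read command.
extract_host() {
    IFS='@#:' read -r _ host _ <<EOF
$1
EOF
    printf '%s\n' "$host"
}
```

The leading delimiter produces an empty first field, which lands in the first _, and everything after the first colon lands in the last _.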

Continue a bash script

After successfully looking up the hosts (IPs), I need to ping them to check whether they are up. The SIDS file contains 2 columns with hostnames. Are there any suggestions on how to improve the code below?
#!/bin/bash
LINES=`cat /home/marko/SIDS | sed "s!/!-!g" | wc -l`
for (( i=1; i<=${LINES}; i++))
do
FIRSTIP=CPE-`sed -n "${i}{p;q}" /home/marko/SIDS | awk '{print $1}'| sed "s!/!-!g"`
SECONDIP=CPE-`sed -n "${i}{p;q}" /home/marko/SIDS | awk '{print $2}'| sed "s!/!-!g"`
COUNT=$( host ${FIRSTIP} | grep address | wc -l )
if [ $COUNT -gt 0 ]
then
echo success
else
echo ${SECONDIP}
fi
done
You can just use dig to avoid searching the output of host:
IP=$(dig +short "$SERVERNAME")
Then, to check whether the host is alive (note that -c needs a packet count):
if ping -q -c 1 "$IP" >/dev/null 2>&1
then
echo "OK"
fi
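On the script itself: it runs sed and awk twice per row and counts the lines up front, none of which is needed if read does the splitting. A possible sketch of just that part follows. cpe_pairs is a hypothetical name, and it assumes the two columns are separated by whitespace, as the awk '{print $1}' and '{print $2}' calls in the question suggest.

```bash
# Sketch: read the two-column SIDS format on stdin, replace "/" with "-",
# and print both names with the CPE- prefix, one pair per line.
cpe_pairs() {
    sed 's!/!-!g' | while read -r first second _; do
        [ -n "$first" ] || continue    # skip blank lines
        printf 'CPE-%s CPE-%s\n' "$first" "$second"
    done
}
```

You would call it as cpe_pairs < /home/marko/SIDS and feed the result to a second while read -r first second loop that does the host or dig lookup and the ping for each pair.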

egrep string case

I have to grep from a file named temp which has something like this
Process State
BE_RP:1 [PL_2_3] Running
BE_RP:2 [PL_2_4] Running
BE_RP:3 [PL_2_5] Running
BE_RP:4 [PL_2_6] Running
FE_SCTP:0 [PL_2_3] Running
FE_SCTP:1 [PL_2_4] Running
BE_NMP:0 Not Running
OAM:0 Running
I need to write an egrep statement which will return the number of processes which are in running or not running state.
awk '/^OAM/ { next } /Not Running[ \t]*$/ { n++; next } /Running[ \t]*$/ { r++ } END { print r+0, n+0 }' foo.txt
Prints <running> <not running>
Running
$ grep -v 'OAM' input | grep -cP '(?<!Not) Running\s*$'
6
Not Running
$ grep -v 'OAM' input | grep -cP 'Not Running\s*$'
1
sed '{
1 d
s/^[^:]*:[0-9]*[ ]*//
s/^[^]]*]//
s/^[ ]*//
}' input_file | sort | uniq -c
grep -P '^(?!OAM:0).*Running' temp | cut -f2 | wc -l
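Since the question asks for egrep specifically, here is a sketch that sticks to extended regular expressions (grep -E) and needs no PCRE lookbehind. Like the answers above, it assumes the OAM line is to be left out of both counts. The function names are invented for the example.

```bash
# Sketch: count states from the process table on stdin, ignoring the OAM line.

# "Running" only: first drop the OAM and "Not Running" lines, then count.
count_running() {
    grep -Ev '^OAM|Not Running' | grep -Ec 'Running[[:space:]]*$'
}

# "Not Running": drop the OAM line, then count.
count_not_running() {
    grep -Ev '^OAM' | grep -Ec 'Not Running[[:space:]]*$'
}
```

Each function reads stdin, so with the file from the question it would be count_running < temp and count_not_running < temp.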