How can I match a pattern that occurs after a known pattern - regex

Consider the format of a bind dns zone file:
zone "mydomain.com" {
type slave;
file "db.mydomain";
masters {
192.168.5.15;
};
};
...
repeated several more times for other zones in the conf file.
In a script, I need to extract some details from the zone.conf file.
I know the domain I am looking for, so I can regex for something like '^zone "mydomain.com"'
But I need to find the first file line that occurs after the zone name I am looking for.
I also want to find the IP address in the masters list.
Our configuration only has one master IP, so I don't have to worry about multiple IPs.
Ideas appreciated.

sed can be used to isolate the right section of the dns file, then print the next line after a pattern matched:
# sed -n '/"mydomain.com"/,/^};$/{/^zone "mydomain.com"/{n;p}}' dnsfile
type slave;
# sed -n '/"mydomain.com"/,/^};$/{/masters/{n;p}}' dnsfile
192.168.5.15;

One approach here would be to use sed to first output the zone block you are interested in, and then grab just the lines you want. This might look something like the following:
sed -n '/^zone "mydomain.com"/,/^};/p' zone.conf | sed -n -e '2p' -e '/[0-9]/p'
2p will print only the second line (first line after the zone name), and /[0-9]/p will print only lines that contain digits (ip address).

To get the next line with trimmed IP:
awk -F';' '/^ *masters/ { getline; sub(/^ */, "", $1); print $0 }' file
OUTPUT
192.168.5.15
To get zone line:
awk -F';' '/^zone "mydomain.com"/ { getline; sub(/^ */, "", $1); print $0}' file
OUTPUT
type slave
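Both values can also be pulled out in a single awk pass that stays inside the requested zone block. The following is a minimal sketch, assuming one directive per line as in the question; the helper name zone_info, the temp file, and the extra other.com zone are made up for illustration.

```shell
#!/bin/sh
# Sample zone file. The mydomain.com block is from the question;
# the other.com block is invented to show that other zones are skipped.
conf=$(mktemp)
cat > "$conf" <<'EOF'
zone "other.com" {
    type slave;
    file "db.other";
    masters {
        10.0.0.1;
    };
};
zone "mydomain.com" {
    type slave;
    file "db.mydomain";
    masters {
        192.168.5.15;
    };
};
EOF

# zone_info DOMAIN FILE  (hypothetical helper)
# Prints the zone's file name, then its single master IP.
zone_info() {
    awk -v dom="$1" '
        # Enter the block only on an exact match of the quoted zone name.
        $1 == "zone" && $2 == ("\"" dom "\"") { inzone = 1; next }
        # Any other zone header ends the block we care about.
        $1 == "zone"    { inzone = 0 }
        !inzone         { next }
        $1 == "file"    { gsub(/[";]/, "", $2); print $2 }
        $1 == "masters" { want_ip = 1; next }
        want_ip         { gsub(/[ \t;]/, ""); print; want_ip = 0 }
    ' "$2"
}

result=$(zone_info mydomain.com "$conf")
printf '%s\n' "$result"
rm -f "$conf"
```

The zone name is compared as a string rather than as a regex, so the dots in the domain are not treated as wildcards.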

Related

Replace in line using sed

I'm creating a shell script, which reads the following list.log
1.15.2.119
1.15.86.33
1.15.251.60
1.20.178.145/31
1.37.33.24
1.54.202.216
1.58.10.126/28
1.80.225.84
1.116.240.174/30
I would like to append /32 to every IP except those that already have a prefix length such as /31 or /28.
Example:
1.14.191.227/32
1.15.2.119/32
1.15.86.33/32
1.15.251.60/32
1.20.178.145/31
1.37.33.24/32
1.54.202.216/32
1.58.10.126/28
1.80.225.84/32
1.116.240.174/30
My attempt also appends /32 to the lines that already have a prefix:
cat list.log | sed 's/$/\/32/'
1.14.191.227/32
1.15.2.119/32
1.15.86.33/32
1.15.251.60/32
1.20.178.145/31/32
1.37.33.24/32
1.54.202.216/32
1.58.10.126/28/32
1.80.225.84/32
1.116.240.174/30/32
This can be done easily in awk. The test is for any existing slash, not just /32, so lines that already carry /31, /28 or /30 are left alone.
awk '!/\//{$0=$0"/32"} 1' Input_file
Explanation: if the line does not contain a /, append /32 to it; the trailing 1 is an always-true pattern that prints every line, edited or not.
Using sed
$ sed 's|\.[0-9]\+$|&/32|' list.log
1.15.2.119/32
1.15.86.33/32
1.15.251.60/32
1.20.178.145/31
1.37.33.24/32
1.54.202.216/32
1.58.10.126/28
1.80.225.84/32
1.116.240.174/30
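One portability note on the command above: \+ is not part of POSIX basic regular expressions, so a sed other than GNU sed may not accept it. A minimal sketch of the same substitution with the portable spelling [0-9][0-9]*, run on a shortened copy of the sample list:

```shell
#!/bin/sh
# A few lines taken from the question's list.log.
list='1.15.2.119
1.20.178.145/31
1.58.10.126/28
1.80.225.84'

# Append /32 only when the line ends in ".<digits>", i.e. has no prefix yet.
# [0-9][0-9]* means "one or more digits" in plain POSIX BRE.
result=$(printf '%s\n' "$list" | sed 's|\.[0-9][0-9]*$|&/32|')
printf '%s\n' "$result"
```

Lines ending in /31 or /28 do not end in a dot followed only by digits, so the substitution leaves them untouched.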
You can add /32 to the end of lines that do not contain /
sed '\,/,!s,$,/32,' list.log > newlist.log
Details:
\,/,! - find lines not containing /
s,$,/32, - and replace end of string position with /32 there.
See the online demo:
#!/bin/bash
s='1.15.2.119
1.15.86.33
1.15.251.60
1.20.178.145/31
1.37.33.24
1.54.202.216
1.58.10.126/28
1.80.225.84
1.116.240.174/30'
sed '\,/,!s,$,/32,' <<< "$s"
Output:
1.15.2.119/32
1.15.86.33/32
1.15.251.60/32
1.20.178.145/31
1.37.33.24/32
1.54.202.216/32
1.58.10.126/28
1.80.225.84/32
1.116.240.174/30

Bash regex to grab subdomain from list of urls

I have a file which contains list of URLs and I want to grab the subdomains from them.
List of URLs are:
https://www.google.com [match www]
https://www.something.random-name.domain.com [match www, something, and random-name]
https://facebook.com [don't match anything]
http://test.prod-op.bpo.yahoo.com [match test, prod-op and bpo]
I've been using the "sed" command to strip the https and http prefix and then the "awk" command to get the subdomains, but the problem is that I can only match the first subdomain. For example:
https://www.something.random-name.domain.com
In the above example my approach would only match "www", but I want it to match "www" along with "something" and "random-name".
Input would be:
https://www.google.com
https://www.something.random-name.domain.com
https://facebook.com
http://test.prod-op.bpo.yahoo.com
Output would be:
www
www something random-name
null
test prod-op bpo
Kindly explain what should be done so that I can match and extract the subdomains.
Thank you!
Here is your example file, and how to use sed to get all subdomains:
$ cat test.txt
https://www.google.com
https://www.something.random-name.domain.com
https://facebook.com
http://test.prod-op.bpo.yahoo.com
$ cat test.txt | sed -e 's/https*:\/\///; s/\.*[^\.]*\.[^\.]*$//; s/^$/null/; s/\./ /g'
www
www something random-name
null
test prod-op bpo
$
Explanation:
s/https*:\/\///; - remove protocol
s/\.*[^\.]*\.[^\.]*$//; - remove domain name and TLD
s/^$/null/; - change an empty line to null
s/\./ /g - change all dots to space
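To see what each of the four substitutions contributes, they can be run one at a time on a single URL. This is only an illustrative sketch; the variable names step1 to step4 are made up here.

```shell
#!/bin/sh
url='https://www.something.random-name.domain.com'

# 1. remove the protocol
step1=$(printf '%s\n' "$url"   | sed 's/https*:\/\///')
# 2. remove the domain name and TLD (the last two dot-separated labels)
step2=$(printf '%s\n' "$step1" | sed 's/\.*[^\.]*\.[^\.]*$//')
# 3. turn an empty result into the word null
step3=$(printf '%s\n' "$step2" | sed 's/^$/null/')
# 4. change the remaining dots to spaces
step4=$(printf '%s\n' "$step3" | sed 's/\./ /g')

printf '%s\n' "$step1" "$step2" "$step3" "$step4"
```

For this URL the second step leaves www.something.random-name, the third step changes nothing, and the last step prints www something random-name.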
With two GNU awk commands:
awk -F '/' '{$0=$NF}1' file | awk -F '.' '{NF=NF-2}; NF<1{$0="null"}1'
$NF: contains last column
NF=NF-2: Removes the last two columns from current row
Output:
www
www something random-name
null
test prod-op bpo
See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
This awk can do it in a single command:
awk -F. '{gsub(/^https?:\/\/|\.?[^.]+\.[^.]+$/, ""); $1=$1; print (/./ ? $0 : "null")}' file
www
www something random-name
null
test prod-op bpo
This might work for you (GNU sed):
sed -E 's#^https?://(.*)(\.[^.]+){2}#\1#;y/./ /;t;cnull' file
Pattern match on the url, removing everything but the required section.
Manipulate the section into the required format and print the result.
Otherwise, change the existing line to null.

Using sed to replace entire phrase

I'm using a shell script to edit my BIND DNS configuration file when adding and removing zone references.
The "master.conf" file has this content:
...
...
zone "mydomain.com" {
type master;
file "/var/zones/m/mydomain.com"
};
...
...
I want to remove the "mydomain.com" entry using sed, but I couldn't write a correct regex for this. The expression must use a variable domain name and match up to the next closing brace and semicolon, something like this:
DOMAIN_NAME="mydomain.com"
sed -i.bak -r 's/^zone "'$DOMAIN_NAME'" \{(.*)\};$//g' /var/zones/master.conf
Note that the content between the braces should be ignored, and the whole chunk has to be replaced with nothing.
I tried some variations of this expression, but without success.
Perhaps you could use awk?
awk -v dom="mydomain.com" '$2 ~ dom, /^};$/ {next}1' file
The , is the range operator. The range is true between the lines with dom in the second field and the line that only contains "};". next skips those lines. The rest are printed.
Use awk '...' file > tmp && mv tmp file to overwrite the original file.
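One thing to watch with a regex test on the second field: a line such as file "/var/zones/m/mydomain.com" inside another zone also has the domain in its second field, so the range could start there. Below is a minimal sketch of a stricter variant that starts the range only on a zone header whose quoted name equals the domain exactly, combined with the temp-file overwrite. The temp files and the sample data (two zones, as in the input of the sed answer in this thread) are assumptions for illustration.

```shell
#!/bin/sh
conf=$(mktemp)
cat > "$conf" <<'EOF'
zone "myd.com" {
type master;
file "/var/zones/m/mydomain.com"
};
zone "mydomain.com" {
type master;
file "/var/zones/m/mydomain.com"
};
EOF

domain='mydomain.com'
tmp=$(mktemp)
# Range start: a zone header whose quoted name is exactly the domain
# (string comparison, so dots are literal). Range end: a line that is only "};".
awk -v dom="$domain" '
    $1 == "zone" && $2 == ("\"" dom "\""), /^};$/ { next }
    { print }
' "$conf" > "$tmp" && mv "$tmp" "$conf"

result=$(cat "$conf")
printf '%s\n' "$result"
rm -f "$conf"
```

Only the myd.com block should remain, including its file line that mentions mydomain.com.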
Try the sed script below; it should work.
Code:
sed -i '/"mydomain.com" [{]/{
:loop
N
s/[}][;]/&/
t end
b loop
:end
d
}' master.conf
Input:
zone "myd.com" {
type master;
file "/var/zones/m/mydomain.com"
};
zone "mydomain.com" {
type master;
file "/var/zones/m/mydomain.com"
};
Output:
zone "myd.com" {
type master;
file "/var/zones/m/mydomain.com"
};
If it doesn't have to be a one liner, you can use 'grep' to get the line numbers, and then use 'sed' to delete the entire stanza from the line numbers.
See Delete specific line number(s) from a text file using sed?
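A rough sketch of that line-number approach, assuming the stanza starts at the zone header and ends at the first following line that is exactly "};". The temp file and variable names are made up for illustration.

```shell
#!/bin/sh
conf=$(mktemp)
cat > "$conf" <<'EOF'
zone "myd.com" {
type master;
file "/var/zones/m/mydomain.com"
};
zone "mydomain.com" {
type master;
file "/var/zones/m/mydomain.com"
};
EOF

domain='mydomain.com'
# grep -n prefixes each match with its line number; -F treats the pattern
# as a fixed string, so the dots in the domain are literal.
start=$(grep -n -F "zone \"$domain\" {" "$conf" | head -n 1 | cut -d: -f1)
[ -n "$start" ] || { echo "zone $domain not found" >&2; exit 1; }

# Position of the first closing line, counted from the header line.
offset=$(tail -n +"$start" "$conf" | grep -n '^};$' | head -n 1 | cut -d: -f1)
[ -n "$offset" ] || { echo "no closing line after line $start" >&2; exit 1; }
end=$((start + offset - 1))

# Delete the stanza by line numbers.
result=$(sed "${start},${end}d" "$conf")
printf '%s\n' "$result"
rm -f "$conf"
```

Here the header is on line 5 and the closing line on line 8, so the sed command becomes 5,8d.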

Using multiple sed commands

Hi, I'm looking to search through a file and output the rest of each line that matches one of the following patterns, with the matching text removed. I don't need the output written to a file. This is what I am currently using; it outputs the required text, but multiple times:
#!/bin/sh
for file in *; do
sed -e 's/^owner //g;p;!d ; s/^admin //g;p;!d ; s/^loc //g;p;!d ; s/^ser //g;p;!d' $file
done
The preferred format would be something like this, so I could have control over what happens in between:
for file in *; do
sed 's/^owner //g;p' $file | head -1
sed 's/^admin //g;p' $file | head -1
sed 's/^loc //g;p' $file | head -1
sed 's/^ser //g;p' $file | head -1
done
An example input file would be the following:
owner sys group
admin guy
loc Q-30934
ser 18r9723
comment noisy fan is something
and the required output is the following:
sys group
guy
Q-30934
18r9723
You're giving sed the p (for Print) command several times. It prints the entire line each time. And unless you tell it not to with the -n option, sed will print the line at the end anyway.
You also give the !d command multiple times.
Edited after you added the multiple-sed version: instead of using head -1, just use -n to avoid printing lines you don't want. Or even use q (Quit) to stop processing after printing the bit you do want.
For instance:
sed -n '/^owner / { s///gp; q; }' $file
The {} group the substitution and quit commands together, so that they are both executed if and only if the pattern is matched. Having used the pattern in the address at the beginning, you can leave it out of the s command. So that command is short for:
sed -n '/^owner / { s/^owner //gp; q; }' $file
I'd suggest:
sed -n -e '/^owner / { s///; p; }' \
-e '/^admin / { s///; p; }' \
-e '/^loc / { s///; p; }' \
-e '/^ser / { s///; p; }' \
*
sed is perfectly capable of reading many files, so the loop control is unnecessary (you aren't doing per-file I/O redirection, for example) and it's reasonable to list the files after the rest of the sed command (that's the * on its own). If you've got a more modern version of sed (e.g. GNU sed), you can combine the patterns into a single line:
sed -r -n -e '/^(owner|admin|loc|ser) / { s///; p; }' *
This might work for you (GNU sed):
sed -n '0,/^owner /{//s///p};0,/^admin /{//s///p};0,/^loc /{//s///p};0,/^ser /{//s///p}' file
Creates a series of toggle switches, one for each of the desired strings. Each switch applies only once per file, i.e. only the first occurrence of each string is printed. The -n suppresses sed's automatic printing, so only the substituted lines appear.
An alternative and depending on file sizes maybe quicker method:
sed -rn '1{x;s/^/owner admin loc ser /;x};/^(owner |admin |loc |ser )/{G;/^(owner |admin |loc |ser )(.*\n.*)\1/!b;s//\2/;P;/\n$/q;s/.*\n//;h}' file
This preps the hold space with the desired strings. For only those lines that contain the desired strings, append the hold space and check if the current line needs to be amended. Match the desired string with the same string in the hold space. If the line has already appeared the match will fail and the line can be disregarded. If the line is yet to be amended, the desired string is removed from the current line and then the first half of the line is printed. If no strings appear in the remaining half of the line the process is over and can be quit. Otherwise remove the first half of the string and replace the hold space with the desired string removed.
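The "first occurrence of each keyword only" logic may be easier to follow in awk. This is a sketch of the same idea using a different tool, not a translation of the sed script above; the temp file and the extra second owner line are made up to show that repeats are ignored.

```shell
#!/bin/sh
input=$(mktemp)
cat > "$input" <<'EOF'
owner sys group
admin guy
loc Q-30934
ser 18r9723
comment noisy fan is something
owner someone else
EOF

result=$(awk '
    BEGIN {
        # The set of keywords whose values we want.
        n = split("owner admin loc ser", keys, " ")
        for (i = 1; i <= n; i++) want[keys[i]] = 1
    }
    # First field is a wanted keyword that has not been seen before:
    # strip the keyword and the following space, then print the rest.
    ($1 in want) && !seen[$1]++ {
        sub(/^[^ ]+ /, "")
        print
    }
' "$input")
printf '%s\n' "$result"
rm -f "$input"
```

The seen array plays the role of the hold space in the sed version: once a keyword has been printed, later lines starting with it are skipped.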

sed: return last occurrence match until end of file

Using sed, how do I return everything from the last occurrence of a match until the end of the file?
(FYI this has been simplified)
So far I've tried:
sed -n '/ Statistics |/,$p' logfile.log
Which returns all lines from the first match onwards (almost the entire file)
I've also tried:
linenum=`tail -400 logfile.log | grep -n " Statistics |" | tail -1 | cut -d: -f1`
sed "$linenum,\$!d" logfile.log
This works, but not over an ssh connection in one command. I really need it all to be in one pipeline.
Format of the log file is as follows:
(There are statistics headers with sub data written to the log file every minute, the purpose of this command is to return the most recent Statistics header together with any associated errors that occur after the header)
Statistics |
Stuff
More Stuff
Even more Stuff
Statistics |
Stuff
More Stuff
Error: incorrect value
Statistics |
Stuff
More Stuff
Even more Stuff
Statistics |
Stuff
Error: error type one
Error: error type two
EOF
Return needs to be:
Statistics |
Stuff
Error: error type one
Error: error type two
Your example script has a space before Statistics but your sample data doesn't seem to. This has a regex which assumes Statistics is at beginning of line; tweak if that's incorrect.
sed -n '/^Statistics |/h;/^Statistics |/!H;$!b;x;p'
When you see Statistics, replace the hold space with the current line (h). Otherwise, append to the hold space (H). If we are not at the end of file, stop here (b). At end of file, print out the hold space (x retrieve contents of hold space; p print).
In a sed script, commands are optionally prefixed by an "address". Most commonly this is a regex, but it can also be a line number. The address /^Statistics |/ selects all lines matching the regular expression; /^Statistics |/! selects lines not matching the regular expression; and $! matches all lines except the last line in the file. Commands with no explicit address are executed for all input lines.
Edit: explained the script in some more detail, and added the following.
Note that if you need to pass this to a remote host using ssh, you will need additional levels of quoting. One possible workaround if it gets too complex is to store this script on the remote host, and just ssh remotehost path/to/script. Another possible workaround is to change the addressing expressions so that they don't contain any exclamation marks (these are problematic on the command line e.g. in Bash).
sed -n '/^Statistics |/{h;b};H;${x;p}'
This is somewhat simpler, too!
A third possible workaround, if your ssh pipeline's stdin is not tied up for other things, is to pipe in the script from your local host.
echo '/^Statistics |/h;/^Statistics |/!H;$!b;x;p' |
ssh remotehost sed -n -f - file
If you have tac available:
tac INPUTFILE | sed '/^Statistics |/q' | tac
This might work for you:
sed '/Statistics/h;//!H;$!d;x' file
Statistics |
Stuff
Error: error type one
Error: error type two
If you're happy with an awk solution, this kinda works (apart from getting an extra blank line):
awk '/^Statistics/ { buf = "" } { buf = buf "\n" $0 } END { print buf }' input.txt
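The extra blank line comes from appending "\n" to a buffer that has just been emptied. A small variant that starts the buffer with the header line itself avoids it. This is a sketch that assumes the last block begins with a line starting with Statistics; the temp file holds a shortened copy of the sample log.

```shell
#!/bin/sh
log=$(mktemp)
cat > "$log" <<'EOF'
Statistics |
Stuff
More Stuff
Error: incorrect value
Statistics |
Stuff
Error: error type one
Error: error type two
EOF

result=$(awk '
    /^Statistics/ { buf = $0; next }   # restart the buffer at each header
    { buf = buf "\n" $0 }              # otherwise append to the current block
    END { print buf }
' "$log")
printf '%s\n' "$result"
rm -f "$log"
```

If the file contains no header at all, the buffer still starts with a newline, so a leading blank line would reappear in that case.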
sed ':a;N;$!ba;s/.*Statistics/Statistics/g' INPUTFILE
should work (GNU sed 4.2.1).
It reads the whole file to one string, then replaces everything from the start to the last Statistics (word included) with Statistics, and prints what's remaining.
HTH
This might also work, a slightly simpler version of the sed solutions given by the others above:
sed -n 'H; /^Statistics |/h; ${g;p;}' logfile.log
Output:
Statistics |
Stuff
Error: error type one
Error: error type two