grep lines that have n occurrences of a given character - regex

I have a file with paths to files. For example looking like this:
/home/smth/a/file1
/home/smth/a/file2
/home/smth/b/file1
/home/smth/a/b/file1
I have a variable F_COUNT=4, and I want to select only the lines where the '/' character appears exactly $F_COUNT times.
It would return this:
/home/smth/a/file1
/home/smth/a/file2
/home/smth/b/file1
I tried using a regex with grep, specifically grep "'/'\{0,$F_COUNT\}", but this doesn't work. How can I do this?

You can use awk to count fields:
F_COUNT=4
awk -F/ -v num="$F_COUNT" 'NF == num+1' file
/home/smth/a/file1
/home/smth/a/file2
/home/smth/b/file1

Using grep, as requested:
F_COUNT=4
grep -E "^(/[^/]+){$F_COUNT}$" file
Output :
/home/smth/a/file1
/home/smth/a/file2
/home/smth/b/file1
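
Note that the grep pattern above requires every '/' to be followed by at least one non-slash character, so a relative path such as a/b/c/d/e or a path with a trailing slash would not be selected. If the goal is strictly "exactly N slashes anywhere in the line", a looser pattern may be closer to it. A minimal sketch, where the function name count_slash_lines is made up for illustration:

```shell
# Hypothetical helper: print the lines of FILE that contain exactly N '/' characters.
# Usage: count_slash_lines N FILE
count_slash_lines() {
    n=$1
    file=$2
    # [^/]* allows empty path components; the {n} interval fixes the slash count.
    grep -E "^[^/]*(/[^/]*){$n}\$" "$file"
}
```

Called as count_slash_lines "$F_COUNT" file, this should select the same three lines from the sample as the answers above.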

Related

Extract Source IP from log files

I want to extract "srcip=x.x.x.x" from a log file in bash. My log file looks like this:
2019:06:23-17:50:03 myhost ulogd[5692]: id="2021" severity="info" sys="SecureNet" sub="packetfilter" name="Packet dropped (GEOIP)" action="drop" fwrule="60019" initf="eth0" srcmac="3c:1e:04:92:6f:fb" dstmac="00:50:56:97:7c:af" srcip="185.53.91.50" dstip="192.168.50.10" proto="6" length="44" tos="0x00" prec="0x00" ttl="235" srcport="54522" dstport="5038" tcpflags="SYN"
I wrote awk '{print $15}' to extract srcip, but the problem is that the position of srcip is not the same in each line. How can I extract srcip=x.x.x.x regardless of its position?
With any sed in any shell on every UNIX box:
$ sed -n 's/.*\(srcip="[^"]*"\).*/\1/p' file
srcip="185.53.91.50"
The following command provides the result you expect
grep -o -P 'srcip="(\d{1,3}[.]){3}\d{1,3}"' log
The -o option prints only the matched parts. The -P option enables Perl-compatible regular expressions (a GNU grep feature). The regex matches srcip="<ipv4>", and log is the name of the file you want to extract content from.
Here is a link to regex101 for an explanation for the regex: https://regex101.com/r/hjuZlM/2
An awk version
awk -F"srcip=" '{split($2,a," ");print FS a[1]}' file
srcip="185.53.91.50"
Split the line using the key word, then get the next field after split.
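
Where grep -P is not available, the same idea can probably be expressed with an extended regex, as long as the local grep still supports -o. A minimal sketch, assuming the value is always wrapped in double quotes; extract_srcip is a hypothetical name, not something from the answers above:

```shell
# Hypothetical helper: print every srcip="..." pair found in FILE, one per line.
# Usage: extract_srcip FILE
extract_srcip() {
    # -o prints only the matching part; [^"]* takes everything up to the closing quote.
    grep -oE 'srcip="[^"]*"' "$1"
}
```

Unlike the -P version, this does not check that the value looks like an IPv4 address; it only relies on the quoting.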

Extract substring from string with sed

I want to extract MIB-Objects from snmpwalk output. The output FILE looks like:
RFC1213-MIB::sysDescr.0.0.0.0.192.168.1.2 = STRING: "Linux debian 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u1 (2017-06-18) x86_64"
RFC1213-MIB::sysObjectID.0 = OID: RFC1155-SMI::enterprises.8072.3.2.10
..
First, I read the output file, split at character = and remove everything between RFC1213-MIB:: and .0 till the end of the string.
while read -r; do echo "${REPLY%%=*}" | sed -e 's/RFC1213-MIB::\(.*\)\.0/\1/'; done <$FILE
My current output:
sysDescr.0.0.0.192.168.1.2
sysObjectID
How can I remove the other values? Is there a better solution of extracting sysDescr, sysObjectID?
With awk:
awk -F[:.] '{print $3}'
(define : and . as field delimiters and display the 3rd field)
with sed (Gnu):
sed 's/^[^:]*::\|\.0.*//g'
(delete two things: the leading run of non-colon characters followed by :: at the start of the line, and the first .0 together with everything after it up to the end of the line)
Maybe you can try with:
sed 's/RFC1213-MIB::\([^\.]*\).*/\1/' $FILE
This will get everything that is not a dot (.) following the RFC1213-MIB:: string.
If you don't want to use sed, you can just use parameter expansion. sed is an external process, so it won't be as fast as parameter expansion, which is built into bash.
while IFS= read -r line; do line=${line#*::}; line=${line%%.*}; echo "$line"; done < file
line=${line#*::} assumes RFC1213-MIB does not have two colons and will be split from sysDescr with two colons.
line=${line%%.*} assumes sysDescr will have a . after it.
If you have more examples, that you think won't work, I can update my answer.
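
The two expansions can be tried on the sample lines from the question. A minimal sketch; the function name mib_object is made up for illustration, and it relies on the same two assumptions stated above (a "::" after the module name and a "." after the object name):

```shell
# Hypothetical helper: strip the "MODULE::" prefix and everything from the first dot on.
mib_object() {
    name=${1#*::}     # drop the shortest prefix ending in "::"
    name=${name%%.*}  # drop the longest suffix starting at a "."
    printf '%s\n' "$name"
}

mib_object 'RFC1213-MIB::sysDescr.0.0.0.0.192.168.1.2 = STRING: "Linux debian"'
mib_object 'RFC1213-MIB::sysObjectID.0 = OID: RFC1155-SMI::enterprises.8072.3.2.10'
```

The second call shows why # (shortest match) is used for the prefix: the later "::" in RFC1155-SMI::enterprises is left alone and then removed by the %% step.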

Delete rows with extra delimiter from csv file in unix

I have a csv file with 3 columns separated by a ',' delimiter. Some values contain a ',' in the data, and I would like to remove those records entirely. Please suggest how I can do this using sed, awk, or grep.
Input file :
monitor,display,45
keyboard,input,20
loud,speaker,output,20
mount,input,20
Expected Output :
monitor,display,45
keyboard,input,20
mount,input,20
I used grep command to filter out rows with extra commas.
grep -v '.*,.*,.*,.*' input_file > output_file
The pattern matches any line that contains at least three commas, i.e. more than three fields.
-v excludes the records which match the pattern specified.
Below is how you can do the same using awk. Basically, you want the records that have exactly 3 fields:
$ awk -F, 'NF==3 {print $0}' data1.txt
monitor,display,45
keyboard,input,20
mount,input,20
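
The awk filter can be wrapped so that the expected column count is a parameter. A minimal sketch, with the hypothetical name keep_rows_with_fields; it splits naively on every comma, so it assumes the file has no quoted fields that legitimately contain commas:

```shell
# Hypothetical helper: keep only rows of FILE with exactly N comma-separated fields.
# Usage: keep_rows_with_fields N FILE
keep_rows_with_fields() {
    # A bare condition with no action prints the matching record.
    awk -F, -v n="$1" 'NF == n' "$2"
}
```

For the sample input, keep_rows_with_fields 3 input_file should drop only the loud,speaker,output,20 row.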

regex - match exactly to a string portion in awk

I have a file where one column contains strings that are composed of characters separated by ,
example:
a123456, a54321, a12312
I need to find lines that contain a specific number in the comma separated list.
example: I want to find all lines that contain only a12345.
I tried to use the following:
awk ' $1~/a12345/ {print}'
but this prints out the line containing:
a123456, a54321, a12312
because the regex is matching the first 6 characters in a123456, I guess.
My question is, how can I write a regex that will only print the lines that contain an exact match?
$ awk '/(^|[^[:alnum:]])a12345([^[:alnum:]]|$)/' file
$ awk '/(^|[^[:alnum:]])a123456([^[:alnum:]]|$)/' file
a123456, a54321, a12312
With GNU awk you could use word-delimiters:
$ awk '/\<a12345\>/' file
$ awk '/\<a123456\>/' file
a123456, a54321, a12312
Try using word match of grep like below:
grep -w a123456 myfile.txt
If the match must be at the start of the line, anchor the pattern:
grep -w '^a123456' myfile.txt
With awk:
awk -F ',\\s*' '$1 == "a12345"' filename
This splits the line on commas (optionally followed by whitespace) and selects only those lines whose first field is exactly "a12345". Because it is a string comparison rather than a regex match, it stays correct even if the field contains characters after "a12345" that count as a word boundary, which is to say that
a12345.foo, bar, baz
is filtered out.
If more than a single field is to be tested, then you'll have to test all fields:
awk -F ',\\s*' 'function check() { for(i = 1; i <= NF; ++i) { if($i == "a12345") return 1; } return 0 } check()' filename
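
The all-fields check can also be written without the \s escape, which not every awk understands, and with the token passed in as a variable. A minimal sketch under those assumptions; lines_with_token is a hypothetical name:

```shell
# Hypothetical helper: print lines of FILE in which any comma-separated
# field equals TOKEN exactly (ignoring blanks around the field).
# Usage: lines_with_token TOKEN FILE
lines_with_token() {
    awk -F',' -v tok="$1" '{
        for (i = 1; i <= NF; i++) {
            f = $i
            gsub(/^[ \t]+|[ \t]+$/, "", f)        # trim blanks around the field
            if ((f "") == tok) { print; next }    # f "" forces a string comparison
        }
    }' "$2"
}
```

With the sample line a123456, a54321, a12312, a call such as lines_with_token a12345 file prints nothing, while lines_with_token a123456 file prints the line.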

How can I replace an existing MAC address with a new MAC address using sed

I have this existing pattern:
ethernet0.generatedAddress = "00:50:56:bf:71:06"
I need to replace a mac address in the above expression with a new mac address using a sed pattern.
Note: The MAC address that needs to be replaced is not the same every time.
I tried this sed expression, but no luck:
sed 's/ethernet0.generatedAddress.*=.*\"([[:xdigit:]]{1,2}:){5}[[:xdigit:]]{1,2}/ethernet0.generatedAddress = \"00:16:3e:5e:1d:01'
Thanks in Advance
Pattern:
([a-z0-9]{2}:[a-z0-9]{2}:[a-z0-9]{2}:[a-z0-9]{2}:[a-z0-9]{2}:[a-z0-9]{2})
Or the following one if uppercase letters are used
([a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2})
Replacement:
new_mac_address // for instance 00:f6:a0:ff:f1:06
Side note: As has been pointed out in the comments below, escape the parentheses and curly brackets if needed, or use the -r option
Using sed it would be something like this (just tested)
sed -r 's/(.*)([a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2}:[a-zA-Z0-9]{2})(.*)/\1NEW_MAC_ADDRESS\3/g' file.txt
Add the -i option if you want to edit the file in place
Content of the tested file (file.txt)
something before ethernet0.generatedAddress = "00:50:56:bf:71:06" and something after
Why not use awk? It gives a simple, easy-to-understand solution.
cat file
some data
ethernet0.generatedAddress = "00:50:56:bf:71:06"
more data
mac="ab:11:23:55:11:cc"
awk -v m="$mac" -F' "|"' '/ethernet0.generatedAddress/ {$2="\""m"\""}1' file
some data
ethernet0.generatedAddress = "ab:11:23:55:11:cc"
more data
It searches for ethernet0.generatedAddress and, if found, replaces field #2 (fields are separated by ") with the new MAC address.
If one extra space is not a problem, this is cleaner:
awk -v m="$mac" -F\" '/ethernet0.generatedAddress/ {$2=FS m FS}1' file
some data
ethernet0.generatedAddress = "ab:11:23:55:11:cc"
more data
Or this:
awk -v m="\"$mac\"" -F\" '/ethernet0.generatedAddress/ {$2=m}1' file
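
For completeness, the sed attempt from the question most likely fails because, without -E (or -r), parentheses and braces are literal characters in sed's basic regex syntax. A minimal sketch of a corrected version, assuming a sed that accepts -E and a new address made only of hex digits and colons; set_generated_mac is a hypothetical name:

```shell
# Hypothetical helper: rewrite the quoted MAC on the ethernet0.generatedAddress line.
# Usage: set_generated_mac NEW_MAC FILE   (prints the result to stdout)
set_generated_mac() {
    # NEW_MAC is assumed to contain only hex digits and colons, so it is safe
    # to splice into the sed replacement text.
    sed -E "/^ethernet0\.generatedAddress[[:space:]]*=/ s/\"([[:xdigit:]]{2}:){5}[[:xdigit:]]{2}\"/\"$1\"/" "$2"
}
```

The address at the front restricts the substitution to the ethernet0.generatedAddress line, so other MAC-like strings elsewhere in the file are left untouched.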