I have a log file made up of bunches of lines, where each bunch is separated from the next by blank lines. Each bunch of lines is about one mail. From each bunch I want to grep certain lines (those containing a common pattern). A sample log file is as follows:
#START#
03:48:19:798: : <23/08/2012 03:48:19:019>
03:48:19:798: : <---23/08/2012 03:48 --->
03:48:19:799: : MAIL FROM IP=1.2.3.4
03:48:19:799: : START CHECKING OF IPLIMIT
03:48:19:799: : STOP CHECKING OF IPLIMIT
03:48:20:848:In : MAIL FROM: <a#abc.com>
03:48:20:848: : [A:A:A]
03:48:20:849: : max attach size-->5242880
03:48:20:856: : User Is Authenticated with "a#abc.com and domain abc.com"
03:48:20:856: : Passed
03:48:20:987:In : RCPT TO: <x#xyz.com>
03:48:20:987: : email x#xyz.com
03:48:20:992: : [A:A:A]
03:48:20:999: : passed
03:48:20:999:Inside the Store Mails
03:48:20:999: : BCC feature is not applicable x#xyz.com
03:48:21:000: : BCC feature is not applicable from a#abc.com
03:48:21:000:Inside the Store
03:48:21:132:In : RCPT TO: <y#xyz.com>
03:48:21:132: : email y#xyz.com
03:48:21:133: : [A:A:A]
03:48:21:140: : passed
03:48:21:140:Inside the Store Mails
03:48:21:140: : BCC feature is not applicable y#xyz.com
03:48:21:140: : not authenticated
03:48:21:140:Inside the Store
03:48:21:271: : Data Received
03:50:32:049: : 552 Size Limit Exceeded(5242880)
03:50:32:049: : File Moved in LargeSize Folder....
03:50:32:049: : File Moved in LargeSize Folder....
03:50:32:049: : Connection closed
03:50:32:049: : File Deleted /home/Mail//mailbox/LargeSize/x#xyz.com:24085.444724474357(1345673901000)
03:50:32:051: : File Deleted /home/Mail//mailbox/LargeSize/y#xyz.com:39872.512978520455(1345673901140)
MAIL DATA : : 6815779 Bytes
Total: Conn : 16713 Quit By Host : 5565 Stored : 11134 Loop:0
#END#
W A R N I N G ---------------W A R N I N G
...Waiting for activity on port Total Thread Started & 16732 Stoped 16730
#START#
03:56:20:790: : <23/08/2012 03:56:20:020>
03:56:20:790: : <---23/08/2012 03:56 --->
03:56:20:791: : MAIL FROM IP=2.3.4.5
03:56:20:792: : IP IS FRIEND IN WHITELIST
03:56:20:834:In : MAIL FROM:<y#xyz.com>
03:56:20:834: : [A:A:A]
03:56:20:834: : null
03:56:20:834: : Passed
03:56:20:834:In : RCPT TO: <a#abc.com>
03:56:20:834: : email a#abc.com
03:56:20:835: : Mailing List
03:56:20:835: : [A:A:A]
03:56:20:836: : passed
03:56:20:836: : Proceesing maillist
03:56:20:839: : Data Received
03:56:20:865: : /home/Mail//mailbox/MailingList/a#abc.com:79602.39544573233(1345674380836) Msg Queued For Delivery
03:56:20:865: : Msg forward successfully
03:56:20:865: : /home/Mail//mailbox/MailingList/M14310.39892966699(1345674380837) Msg Queued For Delivery
MAIL DATA : : 27985 Bytes
Total: Conn : 16732 Quit By Host : 5582 Stored : 11135 Loop:0
#END#
...Waiting for activity on port Total Thread Started & 16735 Stoped 16731
#START#
03:56:23:957: : <23/08/2012 03:56:23:023>
03:56:23:957: : <---23/08/2012 03:56 --->
03:56:23:958: : MAIL FROM IP=2.3.4.5
03:56:23:959: : IP IS FRIEND IN WHITELIST
03:56:23:999:In : MAIL FROM: <x#xyz.com>
03:56:23:999: : [A:A:A]
03:56:23:999: : null
03:56:23:999: : Passed
03:56:23:999:In : RCPT TO: <y#xyz.com>
03:56:23:999: : email y#xyz.com
03:56:24:000: : [A:A:A]
03:56:24:007: : passed
03:56:24:008:Inside the Store Mails
03:56:24:009: : BCC feature is not applicable y#xyz.com
03:56:24:009: : not authenticated
03:56:24:009:Inside the Store
03:56:24:009: : Data Received
03:56:24:053: : /home/Mail//mailbox/External/y#xyz.com:50098.70335800691(1345674384009) Msg Queued For Delivery
03:56:24:054: : Msg forward successfully
MAIL DATA : : 28276 Bytes
Total: Conn : 16735 Quit By Host : 5582 Stored : 11136 Loop:0
#END#
Here, a#abc.com is an external mail id, and x#xyz.com and y#xyz.com are internal mail ids.
For each mail, the bunch of lines from #START# to #END# is generated.
I want to run some pattern matching on each bunch of lines. I only want the bunches where the mail goes from an internal email id to an external email id (the second bunch above).
I don't want the bunches where the mail goes from an external email id to an internal one (the first bunch), or from an internal email id to another internal one (the third bunch).
Once I have the bunches where the mail goes from internal to external, I want to extract the lines containing the words FROM and TO.
I tried using awk's RS, ORS, FS and OFS variables to turn each bunch of lines, starting at #START# and ending at #END#, into a single-line record, but couldn't: I could not replace the newlines with a separator such as | or ~. Also, I don't know how to run multiple pattern matches on each record.
I tried the /PATTERN/ form, but then I could not run the grep command through the system() function to get the lines and check the domain names. It gave me the error: sh: 1: not found. I could not get past it. I used this code:
if ($0 ~ /FROM/) { print $0 | system("egrep -i 'FROM|TO'") }
Also, if I try to export each record using the following type of code, it's not working:
for i in $(cat log_file | awk_file_givin_1_resource_record_at_a_time) ; do pattern_matching_commands ; done
It's not working because the pattern matching works on one line at a time, while I want it to work on the entire bunch at a time.
I think the following Bash script would work well, but you should benchmark it against the size of your logs:
#!/bin/bash
# Pipe-separated list of internal domains; override it with the first argument.
INTERNAL_DOMAINS="${1:-xyz.com|xyz.net}"
declare -i LINES BYTES VALIDS
LINES=0
BYTES=0
VALIDS=0
STATUS=stopped
# IFS= and -r keep leading whitespace and backslashes in each log line intact.
while IFS= read -r LINE
do
    if [ "$STATUS" = stopped ]
    then
        if [ "${LINE:0:7}" = "#START#" ]
        then
            STATUS=started
            PARA=""
        fi
    else
        if [ "${LINE:0:5}" = "#END#" ]
        then
            if [ "$STATUS" = valid ]
            then
                VALIDS+=1
                echo "$PARA" | grep -Ew "FROM|TO"
                echo -e "$VALIDS matched\t----------------------------------------"
            fi
            STATUS=stopped
        elif (echo "$LINE" | grep -Fq "RCPT TO") && (echo "$LINE" | grep -Eqiv "#($INTERNAL_DOMAINS)")
        then
            # A recipient outside the internal domains marks the record as valid.
            STATUS=valid
            PARA+="$LINE
"
        else
            PARA+="$LINE
"
        fi
    fi
    LINES+=1
    BYTES+=${#LINE}
    BYTES+=1
    echo -en "\rRead: lines: $LINES | kB: $(($BYTES/1024)) | matches: $VALIDS " >&2
done
You should set the above script as executable and run it like this to get progress output:
time ./filter.sh "one.int.com|two.int.com" < sample.log > report.out
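If there is always a blank line between records, and never a blank line within a record, use awk's “paragraph mode”: set RS to the empty string. Each record then holds several lines, and ^ and $ anchor to the start and end of the whole record rather than of each line, so match line boundaries with (^|\n) and (\n|$).
awk -v RS= '
/(^|\n)[0-9:]*In : MAIL FROM: <[^<>]*#example\.com>(\n|$)/ &&
/(^|\n)[0-9:]*In : RCPT TO: <[^<>]*#example\.com>(\n|$)/ { … }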
'
If you really need to use the #START# and #END# markers, accumulate data in variables as you go along. Do the processing then reset the variables when you reach #END#. Disable processing until the next #START# if necessary.
BEGIN { in_record = 0; }    # skip everything before the first #START#
/^#START#$/ { in_record = 1; }
!in_record { next; }
/^[0-9:]*In : MAIL FROM: <([^<>]*)>$/ { from = $0; sub(/.*</, "", from); sub(/>.*/, "", from); }
…
/^#END#$/ {
    # processing goes here
    from = "";
    in_record = 0;
}
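Putting the pieces above together, here is a minimal sketch of the filter asked for in the question. It rests on a few assumptions of mine: internal addresses end in #xyz.com (as in the sample log), a mail counts as internal-to-external when the sender is internal and at least one recipient is not, and the MAIL FROM / RCPT TO lines are the wanted output. The function name filter_mails and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: print the MAIL FROM / RCPT TO lines of every #START#..#END# record
# whose sender is internal and which has at least one external recipient.
# Assumption: "internal" means the address ends in #xyz.com (sample log).
filter_mails() {
    awk -v internal='#xyz[.]com>' '
        /^#START#/ { in_record = 1; lines = ""; from_int = 0; to_ext = 0; next }
        !in_record { next }                  # ignore noise between records
        /In : MAIL FROM:/ {
            lines = lines $0 "\n"
            if ($0 ~ internal) from_int = 1
        }
        /In : RCPT TO:/ {
            lines = lines $0 "\n"
            if ($0 !~ internal) to_ext = 1
        }
        /^#END#/ {
            if (from_int && to_ext) printf "%s", lines
            in_record = 0
        }
    ' "$@"
}

# Cut-down version of the three sample records from the question.
sample='#START#
03:48:20:848:In : MAIL FROM: <a#abc.com>
03:48:20:987:In : RCPT TO: <x#xyz.com>
#END#
...Waiting for activity on port
#START#
03:56:20:834:In : MAIL FROM:<y#xyz.com>
03:56:20:834:In : RCPT TO: <a#abc.com>
#END#
#START#
03:56:23:999:In : MAIL FROM: <x#xyz.com>
03:56:23:999:In : RCPT TO: <y#xyz.com>
#END#'

printf '%s\n' "$sample" | filter_mails
```

Only the second record's two lines are printed. On a real log you would call it as filter_mails logfile, and adjust the internal pattern if you have more than one internal domain.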
I have a use case where I need to find the records in an application log file that contain specific keywords.
I tried this with grep, but grep uses \n as the line separator, so log entries whose messages contain \n are only partially fetched.
A sample application log file (each of these is a separate line, in other words it ends with \n):
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
2017-11-22 03:12:23 LogManager : Currently processing data {Name: Dummy}
Fetching last name
{LastName : Value}
SomeRandomMessage
2017-11-22 03:12:23 LogManager : SomeRandomMessage
Currently processing data {Name: Dummy2}
Fetching last name
SomeRandomMessage
{LastName : Value3}
.
.
.
.
I want to use YYYY-MM-DD HH:MM:SS as a record separator and then find, within each record, whether it contains Hello and World (for example).
Expected output :
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
What I've tried :
grep 'Hello' fileName
>>
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Using any POSIX awk:
$ cat tst.awk
/^[0-9]{4}(-[0-9]{2}){2} [0-9]{2}(:[0-9]{2}){2} / {
prt()
rec = $0
next
}
{ rec = rec ORS $0 }
END {
prt()
}
function prt() {
if ( (rec ~ /Hello/) && (rec ~ /World/) ) {
print rec
}
}
$ awk -f tst.awk file
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
Since these are logs and you mentioned that they all have the same format, you could try the following code in GNU awk. The regex used in the code is (^|\n)[0-9]{4}(-[0-9]{2}){2} ([0-9]{2}:){2}[0-9]{2} LogManager : [^{]*{Name: Hello}\nFetching last name\n{LastName : World.
awk -v RS="" '
{while(match($0,/(^|\n)[0-9]{4}(-[0-9]{2}){2} \
([0-9]{2}:){2}[0-9]{2} LogManager : \
[^{]*{Name: Hello}\nFetching last name\n\
{LastName : World/)){
print substr($0,RSTART,RLENGTH)
$0=substr($0,RSTART+RLENGTH)
}
}
' Input_file
I want to use YYYY-MM-DD HH:MM:SS as a record separator
You may use this GNU awk command:
awk -v RS='[0-9]{4}(-[0-9]{2}){2} ([0-9]{2}:){2}[0-9]{2}' '
/Hello/ && /World/ {printf "%s", prev $0}
{prev = RT}' file
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
Here -v RS='[0-9]{4}(-[0-9]{2}){2} ([0-9]{2}:){2}[0-9]{2}' sets the record separator to the date-time string. RT holds the separator text that terminated the current record, which is the timestamp of the next entry, so each record's own timestamp is the RT saved while reading the previous record (prev). When a record matches both Hello and World, it is printed after prev.
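For a non-GNU POSIX awk you can consider this simple sed | awk solution. It assumes that no record contains a blank line of its own. The \n in the replacement is understood by GNU sed; other seds may need a literal newline there: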
sed -E 's/^[0-9]{4}(-[0-9]{2}){2} [0-9]{2}(:[0-9]{2}){2} /\n&/' file |
awk -v RS= '/Hello/ && /World/'
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
UPDATE - original answer (see edit/revision) made a few assumptions based on OP's sample input; OP has since stated (per comments) that the sample input is not representative of an actual log file ...
Per comments from OP:
search pattern(s) won't necessarily reside in the 1st (and/or last) line of a log entry
a log entry may have a variable number of lines
cannot rely on the string LastName being on the last line of the log entry (and at this point I'm going to assume a log entry may not even contain the string LastName)
Assumptions:
a log entry will always start with a date of the format YYYY-MM-DD as the first field of the first line of said log entry
a search pattern will not span multiple lines
need to support up to 2 search patterns (more could be added with a redesign)
Adding some additional sample data:
$ cat fileName
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
last line for this entry
2017-11-22 03:12:23 LogManager : Currently processing data {Name: Dummy}
Fetching last name
{LastName : Value}
last line?
nope, this is the last line
2017-11-22 05:17:33 LogManager : Currently processing data {Name: Dummy2}
Fetching last name
{LastName : Value3}
2017-11-22 12:13:02 LogManager : Currently processing data {Name: WhoYaCalinDummy}
Fetching last name
{LastName : WhoMe}
One awk idea:
findme1='Hello'
findme2='World'
awk -v ptn1="${findme1}" -v ptn2="${findme2}" '
function test_and_print() {
if (log_entry ~ ptn1 && log_entry ~ ptn2) # if ptn1/ptn2 show up anywhere in our log entry then ...
print log_entry # print it to stdout
log_entry="" # reset our variable
}
BEGIN { date_regex="[0-9]{4}-[0-9]{2}-[0-9]{2}" }
$1 ~ date_regex { test_and_print() } # new log entry so test the previous entry
{ log_entry= log_entry (log_entry ? RS : "") $0 } # append current line to current log entry
END { test_and_print() } # test the last entry
' fileName
For findme1='Hello'; findme2='World' this generates:
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
last line for this entry
For findme1='Hello'; findme2='Bob' this generates:
# no output
For findme1='WhoMe'; findme2='' this generates:
2017-11-22 12:13:02 LogManager : Currently processing data {Name: WhoYaCalinDummy}
Fetching last name
{LastName : WhoMe}
For findme1='XXXX'; findme2='' this generates:
# no output
The input file is assumed to be named stack. The keyword search is performed with sed, although that is personal preference; you may replace it with grep if you wish. I find grep a tiny bit less reliable, myself.
The results of all successful keyword searches are sent to output.
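#!/bin/sh -x
# ed script: mark the first entry (from the line matching 2017 to the line
# matching LastName), write it to f1, then delete it from the input file.
cat > ed1 <<EOF
/2017/
ka
/LastName/
kb
'a,'bw f1
'a,'bd
wq
EOF
next () {
    # keep going while the input file still has content
    [ -s stack ] && main
    exit 0
}
main () {
    ed -s stack < ed1
    test=$(sed -n '/ello/p' f1)
    [ -n "${test}" ] && echo "${test}" >> output
    rm -v ./f1
    next
}
next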
I use the following code to load a text file of emails
and create users in the system with a user password.
The text file contains emails like the following:
abc#gmail.com
BDD#gmail.com
ZZZ#gmail.com
If a name comes in upper case I convert it to lower case; I was able to make that work.
Now I need to support another kind of input instead of email,
e.g.
P123456
Z877777
but for this type of input I don't want to convert it to lower case,
something like
if(emailpattern )
convert to lower
else
Not
This is the code that works for emails, but I have failed to make it handle the new input:
for user in $(cat ${users} | awk -F";" '{ print $1 }'); do
user=$(echo ${user} | tr "[:upper:]" "[:lower:]")
log "cf create-user ${user} ${passwd}"
#Here we are creating email user in the sys
cf create-user ${user} ${passwd} 2>&1 |
tee -a ${dir}/${scriptname}.log ||
{ log "ERROR cf create-user ${user} failed" ;
errorcount=$[errorcount + 1]; }
done
You can use:
while IFS= read -r user; do
# convert to lowercase only when $user has # character
[[ $user == *#* ]] && user=$(tr "[:upper:]" "[:lower:]" <<< "$user")
log "cf create-user ${user} ${passwd}"
cf create-user ${user} ${passwd} 2>&1 |
tee -a ${dir}/${scriptname}.log ||
{ log "ERROR cf create-user ${user} failed" ;
errorcount=$[errorcount + 1]; }
done < <(awk -F ';' '{ print $1 }' "$users")
Assumptions:
input file consists of email addresses or names, each on a separate line
email addresses are to be converted to lower case
names are to be left as is (ie, no conversion to lower case)
all of the log/cf/tee/errorcount code functions as desired
Sample input file:
$ cat userlist
abc#gmail.com
BDD#gmail.com
ZZZ#gmail.com
P123456
Z877777
We'll start by using awk to conditionally convert email addresses to lower case:
$ awk '/#/ {$1=tolower($1)} 1' userlist
abc#gmail.com
bdd#gmail.com
zzz#gmail.com
P123456
Z877777
first we'll run the input file (userlist) through awk ...
/#/ : for lines that include an email address (ie, contains #) ...
$1=tolower($1) : convert the email address (field #1) to all lowercase, then ...
1 : true for all records and implies print all inputs to output
Now feed the awk output to a while loop to perform the rest of the operations. Process substitution is used instead of a pipe so that the loop runs in the current shell and changes to errorcount survive it:
while read -r user
do
    log "cf create-user ${user} ${passwd}"
    #Here we are creating email user in the sys
    cf create-user ${user} ${passwd} 2>&1 |
    tee -a ${dir}/${scriptname}.log ||
    { log "ERROR cf create-user ${user} failed" ;
      errorcount=$((errorcount + 1)) ;
    }
done < <(awk '/#/ {$1=tolower($1)} 1' userlist)
updated to correctly increment errorcount by 1
bash (4.0 or later) can lower-case text itself:
while IFS= read -r line; do
[[ $line == *#* ]] && line=${line,,}
# do stuff with "$line"
done
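If the script has to run under a shell without ${line,,} (plain sh, or a bash older than 4.0), the same decision can be sketched with case and tr. This is only an illustration: the function name normalize_user is made up, and "looks like an email" is assumed to mean "contains #", as in the samples in this thread.

```shell
#!/bin/sh
# Lower-case a value only when it looks like an email address
# (assumption: it contains "#", the separator used in the sample input).
normalize_user() {
    case $1 in
        *'#'*) printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' ;;
        *)     printf '%s\n' "$1" ;;
    esac
}

# Demo on two of the sample inputs: the email is lower-cased, the id is kept.
while IFS= read -r line; do
    normalize_user "$line"
done <<'EOF'
BDD#gmail.com
P123456
EOF
```

Inside the real loop you would assign the result instead of printing it, for example user=$(normalize_user "$user").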
I'm attempting to write something that can process Nagios config files containing a block of text, and either add # to the start of each line, or delete the block. (To function as a kind of mass check removal in Nagios itself).
Ex:
define service {
service_description Service 1
use Template 1
host_name Host 1
check_command Command A
}
define service {
service_description Service 2
use Template 1
host_name Host 1
check_command Command B
}
define service {
service_description Service 3
use Template 1
host_name Host 1
check_command Command C
}
Would need to change to this (or equivalent):
define service {
service_description Service 1
use Template 1
host_name Host 1
check_command Command A
}
#define service {
# service_description Service 2
# use Template 1
# host_name Host 1
# check_command Command B
#}
define service {
service_description Service 3
use Template 1
host_name Host 1
check_command Command C
}
Is there a way to regex match the block between "define service {" and "}" that contains either "Service 2" or "Command B", and comment out or delete that block via sed/awk/perl, etc?
Thanks.
sed '
# take each paragraph one by one
/define service {/,/}/{
    # inside the paragraph, append each line (one at a time) to the hold buffer
    H
    # if this is not the end of the paragraph, delete the line from the output (and cycle to the next line)
    /}/!d
    # empty the current line (the last of the paragraph) and swap it with the whole buffer
    s/.*//;x
    # if it contains Service 2 or Command B, go to label comm
    /Service 2/ b comm
    /Command B/ b comm
    # otherwise go to label head (neither Service 2 nor Command B in the paragraph)
    b head
    :comm
    # comment out each line of the paragraph (multi-line, with \n as the line separator)
    s/\n/&#/g
    :head
    # remove the first character (a newline, because H rather than h was used on the first line)
    s/.//
}
# default behaviour prints the content
' YourFile
Self commented
Here's a regex that creates the match you want:
/define service\s\{.*(Service 2|Command B).*\}/s
You can test it here. I have no experience with sed, awk or perl, so I will not make any attempt to create a replacement.
sed -n '/define service {/,/}/{:1;N;/\}/!b 1;/Command B\|Service 2/{s/\n/\n#/g;s/^/#/g;p;d;b 1};p;d;b 1}' file_name
This is how it works:
It picks the block within braces and keeps appending lines to the pattern space until a closing brace is found. It then searches the pattern space for the two keywords; if either is found, it puts a # in front of every line in the pattern space and prints the content. Finally it clears the pattern space and repeats the process.
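For comparison, the same collect-then-decide idea can be written in awk, which some people find easier to read than the sed hold-space version. This is a sketch, not a tested Nagios tool: the function name comment_service_blocks is made up, the keywords are hard-coded, and a plain substring match on "Service 2" would also hit "Service 20".

```shell
#!/bin/sh
# Buffer each "define service { ... }" block; when the closing brace arrives,
# print the block, prefixing every line with # if a keyword was seen in it.
comment_service_blocks() {
    awk '
        /define service[ \t]*[{]/ { in_block = 1; n = 0; hit = 0 }
        in_block {
            block[++n] = $0
            if ($0 ~ /Service 2/ || $0 ~ /Command B/) hit = 1
            if ($0 ~ /^[ \t]*[}]/) {          # end of block: flush the buffer
                prefix = hit ? "#" : ""
                for (i = 1; i <= n; i++) print prefix block[i]
                in_block = 0
            }
            next
        }
        { print }                             # lines outside any block
    ' "$@"
}

# Demo: only the second block is commented out.
comment_service_blocks <<'EOF'
define service {
    service_description    Service 1
    check_command          Command A
}
define service {
    service_description    Service 2
    check_command          Command B
}
EOF
```

To delete the block instead of commenting it out, skip the print loop when hit is set.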
This is my TCL Script:
set test {
device#more system:/proc/dataplane/fw/application
1 : Amazon Instant Video (num of policy actions: 0)
port-proto:
http urls :
*(www.amazon.com/Instant-Video)*
dns names :
https client-hello servNames :
https server-hello servNames :
https server-certificate commonNames :
Application stats :
Bytes Uploaded : 0
Bytes Download : 0
Num Flows : 0
2 : SIP (num of policy actions: 0)
port-proto:
Proto 6-6, sport 0-65535, dport 5060-5061
Proto 17-17, sport 0-65535, dport 5060-5061
http urls :
dns names :
https client-hello servNames :
https server-hello servNames :
https server-certificate commonNames :
Application stats :
Bytes Uploaded : 0
Bytes Download : 0
Num Flows : 0
3 : Photobucket (num of policy actions: 0)
port-proto:
http urls :
*(www.pbsrc.com)*
*(www.photobucket.com)*
dns names :
*.photobucket.co (2)
*.photobucket.com (2)
https client-hello servNames :
https server-hello servNames :
https server-certificate commonNames :
Application stats :
Bytes Uploaded : 34
Bytes Download : 44
Num Flows : 78
4 : Filestub (num of policy actions: 0)
port-proto:
http urls :
*(www.filestub.com)*
dns names :
*.filestub.com (2)
https client-hello servNames :
https server-hello servNames :
https server-certificate commonNames :
Application stats :
Bytes Uploaded : 0
Bytes Download : 0
Num Flows : 0
--More--
device#
}
set lines [split $test \n] ; # split using new line char(\n)
set data [join $lines :]
if { [regexp {Photobucket.*(Bytes Uploaded : .* Bytes Download:)} $data x y]} {
set y [string trimright $y {: }]
puts "Bytes uploaded : $y"
}
I am trying to find the bytes downloaded and uploaded for the application called "Photobucket" in the $test variable.
Steps the script should perform:
1. First identify the word "Photobucket".
2. Then grep for "Bytes Uploaded : <any number>", "Bytes Download : <any number>" and "Num Flows : <any number>" for that application, "Photobucket".
Output should be:
Application Name : "Photobucket"
Bytes Uploaded : 34
Bytes Download : 44
Num Flows : 78
When I run my script I am getting only the last line in $test.
Please help me to fix this.
Thanks,
Kumar
First, I don't think the regex in your question is the one you are actually using, because as posted it doesn't match at all: a space is missing. It should be:
Photobucket.*(Bytes Uploaded : .* Bytes Download :)
Now, the problem with this regex is that .* is greedy: it matches to the end of the string (since it matches anything and everything) and then backtracks one character at a time until the whole regex matches, which happens at the last Bytes Uploaded : and Bytes Download :. If no match is found that way, the regex fails. What you need is to make the .* lazy (match as little as possible) with the ? modifier:
Photobucket.*?(Bytes Uploaded : .*? Bytes Download :)
The above will match the correct part, except you will have an incorrect value in y since you will also have Bytes Uploaded and such. The trim cannot remove those. You might thus change the regex a bit more:
Photobucket.*?Bytes Uploaded : (\S+):
This will put non space characters, matched by (\S+) into the variable y. You don't need to trim after that.
And you don't need to split and rejoin if you change the regex:
if { [regexp {Photobucket.*?Bytes Uploaded : (\S+)\s} $test - y]} {
puts "Bytes uploaded : $y"
}
To get all the three values, you then just need to add them at the end:
if { [regexp {Photobucket.*?Bytes Uploaded : (\S+)\s+Bytes Download : (\S+)\s+Num Flows : (\S+)\s+} $test - x y z]} {
puts "Bytes uploaded : $x"
puts "Bytes download : $y"
puts "Num flows : $z"
}
You can use string commands instead of a giant regex
set stats {"Bytes Uploaded" "Bytes Download" "Num Flows"}
set photobucket_idx [string first Photobucket $test]
foreach stat $stats {
set digits_start [expr {[string first "$stat : " $test $photobucket_idx] + [string length "$stat : "]}]
set digits_end [expr {[string first \n $test $digits_start] - 1}]
set digits($stat) [string range $test $digits_start $digits_end]
}
parray digits
outputs
digits(Bytes Download) = 44
digits(Bytes Uploaded) = 34
digits(Num Flows) = 78
Using ActiveState Perl 5.8 on Windows. I am placing the results of sc qc MyServiceName into a variable.
$MSSQLResults=`sc qc $MSSQLServiceName`;
print "MSSQLResults $MSSQLResults";
If I print the variable to STDOUT I get something like:
[SC] QueryServiceConfig SUCCESS
SERVICE_NAME: MSSQL$INSTANCE1
TYPE : 10 WIN32_OWN_PROCESS
START_TYPE : 3 DEMAND_START
ERROR_CONTROL : 1 NORMAL
BINARY_PATH_NAME : "C:\Program Files\Microsoft SQL Server\MSSQL10_50.INSTANCE1\MSSQL\Binn\sqlservr.exe" -sINSTANCE1
LOAD_ORDER_GROUP :
TAG : 0
DISPLAY_NAME : SQL Server (INSTANCE1)
DEPENDENCIES :
SERVICE_START_NAME : TESTLAB\svc_SQLServer
The string I want returned from a grep or regex match is TESTLAB\svc_SQLServer. Should I use grep, or a regex, or something else? What line of Perl would accomplish what I want? The text TESTLAB\svc_SQLServer will vary depending on which machine I run it on.
If I understand you correctly, you have a scalar variable (e.g $MSSQLResults) which you want to search. In that case:
if (my ($service_start_name) = $MSSQLResults=~ m/SERVICE_START_NAME\s+:\s+(.*)/m) {
# do something
}