Filter SIP Messages from log file - regex

I work with VoIP solutions as my day job, and I often get SIP message debugs from the devices I manage. Most of the time, however, those logs contain more calls than I need to analyze, so it would be great if I could filter them.
I want a tool that, given a log file and the Call-IDs I need, filters the log file down to only those SIP messages.
Unfortunately, SIP messages span more than one line, so my experience with grep is not sufficient to get this to work.
I started to write something in Perl for this, but I didn't get any further than checking whether I had the right number of parameters. Is Perl the best language for this? I have included part of the input here:
Jan 28 11:39:37.525 CET: //1393628/D5CC0586A87B/SIP/Msg/ccsipDisplayMsg:
Received:
SIP/2.0 200 OK
Via: SIP/2.0/UDP 10.218.16.2:5060;branch=z9hG4bKB22001ED5
From: "Frankeerapparaat Secretariaat" <sip:089653717#10.210.2.49>;tag=E7E0EF64-192F
To: <sip:022046187#10.210.2.49>;tag=25079324~19cc0abf-61d9-407f-a138-96eaffee1467-27521338
Date: Mon, 28 Jan 2013 10:39:32 GMT
Call-ID: D5CCA1AE-686D11E2-A881ED01-8DFA6D70#10.218.16.2
CSeq: 102 INVITE
Allow: INVITE, OPTIONS, INFO, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY
Allow-Events: presence
Supported: replaces
Supported: X-cisco-srtp-fallback
Supported: Geolocation
Session-Expires: 1800;refresher=uas
Require: timer
P-Preferred-Identity: <sip:022046187#10.210.2.49>
Remote-Party-ID: <sip:022046187#10.210.2.49>;party=called;screen=no;privacy=off
Contact: <sip:022046187#10.210.2.49:5060>
Content-Type: application/sdp
Content-Length: 209
v=0
o=CiscoSystemsCCM-SIP 2000 1 IN IP4 10.210.2.49
s=SIP Call
c=IN IP4 10.210.2.1
t=0 0
m=audio 16844 RTP/AVP 8 101
a=rtpmap:8 PCMA/8000
a=ptime:20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
Jan 28 11:39:37.529 CET: //1393628/D5CC0586A87B/SIP/Msg/ccsipDisplayMsg:
Sent:
ACK sip:022046187#10.210.2.49:5060 SIP/2.0
Via: SIP/2.0/UDP 10.218.16.2:5060;branch=z9hG4bKB2247150A
From: "Frankeerapparaat Secretariaat" <sip:089653717#10.210.2.49>;tag=E7E0EF64-192F
To: <sip:022046187#10.210.2.49>;tag=25079324~19cc0abf-61d9-407f-a138-96eaffee1467-27521338
Date: Mon, 28 Jan 2013 10:39:36 GMT
Call-ID: D5CCA1AE-686D11E2-A881ED01-8DFA6D70#10.218.16.2
Max-Forwards: 70
CSeq: 102 ACK
Authorization: Digest username="Genk_AC_1",realm="infraxnet.be",uri="sip:022046187#10.210.2.49:5060",response="9546733290a96d1470cfe29a7500c488",nonce="5V/Jt8FHd5I8uaoahshiaUud8O6UujJJ",algorithm=MD5
Allow-Events: telephone-event
Content-Length: 0
Jan 28 11:39:37.529 CET: //1393627/D5CC0586A87B/SIP/Msg/ccsipDisplayMsg:
Sent:
SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.168.8.11:5060;branch=z9hG4bK24ecaaaa6dbd3
From: "Frankeerapparaat Secretariaat" <sip:3717#192.168.8.11>;tag=e206cc93-1791-457a-aaac-1541296cf17c-29093746
To: <sip:022046187#192.168.8.28>;tag=E7E0F8A4-EA3
Date: Mon, 28 Jan 2013 10:39:32 GMT
Call-ID: fedc8f80-10615564-45df0-b08a8c0#192.168.8.11
CSeq: 101 INVITE
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY, INFO, REGISTER
Allow-Events: telephone-event
Remote-Party-ID: <sip:022046187#192.168.8.28>;party=called;screen=no;privacy=off
Contact: <sip:022046187#192.168.8.28:5060>
Supported: replaces
Supported: sdp-anat
Server: Cisco-SIPGateway/IOS-15.3.1.T
Session-Expires: 1800;refresher=uas
Require: timer
Supported: timer
Content-Type: application/sdp
Content-Disposition: session;handling=required
Content-Length: 247
v=0
o=CiscoSystemsSIP-GW-UserAgent 7276 9141 IN IP4 192.168.8.28
s=SIP Call
c=IN IP4 192.168.8.28
t=0 0
m=audio 30134 RTP/AVP 8 101
c=IN IP4 192.168.8.28
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=ptime:20
The program I envision would take 2 or more arguments: the log file and then any number of Call-IDs of the calls I am interested in. It would then keep only the relevant messages and print them to stdout.
Note that a single SIP message may include a blank line. The next message starts only when a new timestamp is shown.

Perhaps the following will be helpful:
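use strict;
use warnings;
# Everything after the log file name is a Call-ID; remove those from @ARGV
# so that <> reads only the log file.
my %callIDs = map { $_ => 1 } splice @ARGV, 1, @ARGV - 1;
my $recordPrinted;
# Paragraph mode: read one blank-line-separated chunk at a time.
local $/ = '';
while (<>) {
    if ( /Call-ID:\s+(.+)/ and $callIDs{$1} ) {
        $recordPrinted = 1;
        print;
        next;
    }
    # Print the chunk that follows a record we just printed (it contains rtpmap).
    print if $recordPrinted and /\brtpmap\b/;
    $recordPrinted = 0;
}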
Usage: perl scriptName.pl logFile callID [callID]
When you pass parameters to a script, they end up in @ARGV. The splice takes (and removes) the elements from index 1 onward to build a hash's keys. The zeroth element is the log file name, which stays in @ARGV so that <> reads it.
local $/ = ''; sets paragraph mode, i.e., a whole chunk of text separated by blank lines is read at a time. The regex captures the Call-ID and, if a key exists for that ID, the chunk is printed. Also, a flag is set to indicate that a 'record' has been printed, since there may be another 'chunk' to that record.
If a record's been printed and a field is found that occurs in a second part of the record (rtpmap), that chunk is printed.
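The question notes that a message may contain a blank line and that the next message starts only at a new timestamp. A different approach from the paragraph-mode script above is to split on the timestamp lines instead. Below is a minimal sketch of that idea in Python rather than Perl; the timestamp pattern is an assumption based on the sample log, and the names split_messages and filter_sip_messages are made up for illustration.

```python
import re

# Assumed timestamp format, taken from the sample log:
# "Jan 28 11:39:37.525 CET: //1393628/..."
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\.\d+\s")
CALL_ID = re.compile(r"^Call-ID:\s*(\S+)", re.IGNORECASE | re.MULTILINE)


def split_messages(lines):
    """Yield each message as one string; a new message starts at a timestamp line."""
    current = []
    for line in lines:
        if TIMESTAMP.match(line) and current:
            yield "".join(current)
            current = []
        current.append(line)
    if current:
        yield "".join(current)


def filter_sip_messages(lines, call_ids):
    """Return the messages whose Call-ID header is one of call_ids."""
    wanted = set(call_ids)
    kept = []
    for message in split_messages(lines):
        match = CALL_ID.search(message)
        if match and match.group(1) in wanted:
            kept.append(message)
    return kept
```

To use it on a file, pass an open file object as lines (each line keeps its newline) together with the list of Call-IDs, then write the returned messages to stdout.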

Try this
if ($subject =~ m/(?im)(Call-ID: (.+))$/) {
$result = $2;
} else {
$result = "";
}

Related

Does SNS HTTP/S delivery honor any HTTP codes?

I created a test to fill my SNS dead letter queue to help me develop code to read from this queue. Long story short, I thought an HTTP error would be easiest to simulate failures, but surprisingly, they seem to be counted as success.
In case I am doing it wrong and for the benefit of anyone else who wants to try this out, here is my methodology. I created an HTTP/s endpoint specifically for this test using a bash one liner:
while true; do echo -e "HTTP/1.1 200 OK\n" | nc -Nl 9078; echo "" && date; done
So far so good. I decided that returning a 401 code might be easiest. Capturing a 401 page output with netcat:
HTTP/1.1 401 Unauthorized
Server: nginx/1.21.0
Date: Wed, 01 Sep 2021 12:22:03 GMT
Content-Type: text/html
Content-Length: 179
Connection: keep-alive
WWW-Authenticate: Basic realm="Restricted example.com"
Strict-Transport-Security: max-age=31536000
<html>
<head><title>401 Authorization Required</title></head>
<body>
<center><h1>401 Authorization Required</h1></center>
<hr><center>nginx/1.21.0</center>
</body>
</html>
I altered my one liner accordingly:
while true; do echo -e "$(cat 401error)\n" | nc -Nl 9078; echo "" && date; done
I verified that visiting this page in Firefox would pop up a password dialog.
Come test time, SNS blunders along and delivers the message without fear. The message never appears in the DLQ:
POST /poot/testingevent HTTP/1.1
x-amz-sns-message-type: Notification
x-amz-sns-message-id: REDACTED
x-amz-sns-topic-arn: REDACTED
x-amz-sns-subscription-arn: REDACTED
x-amz-sns-rawdelivery: true
Content-Length: 24
Content-Type: text/plain; charset=UTF-8
Host: example.com:9078
Connection: Keep-Alive
User-Agent: Amazon Simple Notification Service Agent
Accept-Encoding: gzip,deflate
{"401 for sure man": 11}
Wed Sep 1 12:25:31 UTC 2021
Does anyone know? Nothing so far uncovered in duckduckgoing "http code" sns. If I can capture some other codes (403,500,etc) using netcat, I thought it might be useful to know which, if any, are honored.
Any status code outside of the range 200 - 499 will be considered as a failure and retried according to your retry policy as per https://docs.aws.amazon.com/sns/latest/dg/sns-message-delivery-retries.html. Once the max number of retries has been exhausted, the message will be delivered to a DLQ if one is configured.
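To try a status code outside that range, an endpoint that always answers with a chosen code can stand in for the netcat loop. This is only a sketch using Python's standard http.server module; the port 9078 is taken from the question, the default of 500 is an arbitrary choice, and make_handler and serve are hypothetical names.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(status_code):
    """Build a handler class that answers every POST with status_code."""

    class FixedStatusHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read and show the delivered body so each attempt is visible.
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            print(self.headers.get("x-amz-sns-message-type"), body)
            # Reply with the fixed status code and an empty body.
            self.send_response(status_code)
            self.send_header("Content-Length", "0")
            self.end_headers()

    return FixedStatusHandler


def serve(status_code=500, port=9078):
    """Listen on the given port and answer every POST with status_code."""
    HTTPServer(("", port), make_handler(status_code)).serve_forever()
```

Calling serve(500) and publishing a test message should then show whether repeated delivery attempts arrive before the message reaches the DLQ.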

Why isn't my boolean expression working

I am using the following regular expression to filter some junk emails
\bNeotube\b | \bNeotubeTV\b
Here is a sample of the junk email header:
Return-path: <uranus#pschic.info>
Envelope-to: coben#jesusmylord.org
Delivery-date: Thu, 25 May 2017 14:18:58 +0200
Received: from [45.59.120.18] (port=30375 helo=pschic.info)
by ok1057.kvchosting.com with esmtp (Exim 4.89)
(envelope-from <uranus#pschic.info>)
id 1dDrj8-0002X1-Se
for coben#jesusmylord.org; Thu, 25 May 2017 14:18:58 +0200
From: "NeotubeTV" <uranus#pschic.info>
Date: Thu, 25 May 2017 06:58:18 -0500
MIME-Version: 1.0
Subject: Free TV shows, Sports and New Movies on your TV In HD?
To: <coben#jesusmylord.org>
Message-ID: `
The above expression does not work. However, if I just use
\bNeotubeTV\b
the email is filtered. Why doesn't the above OR statement work?
Thanks for your help.
Chris
It's because you have spaces around the |, which means those spaces are part of the pattern and have to be matched literally. Remove them: \bNeotube\b|\bNeotubeTV\b
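The effect of the spaces can be checked against the From: header in the question. The sketch below uses Python's re module purely for illustration; the mail filter in the question may use a different regex engine, but a literal space in a pattern is treated the same way in the common ones unless an extended (whitespace-ignoring) mode is switched on.

```python
import re

# The From: line from the sample header in the question.
header = 'From: "NeotubeTV" <uranus#pschic.info>'

# With spaces: the first alternative needs a space after "Neotube",
# the second needs a space before "NeotubeTV".
with_spaces = re.compile(r"\bNeotube\b | \bNeotubeTV\b")

# Without spaces: plain alternation between the two words.
without_spaces = re.compile(r"\bNeotube\b|\bNeotubeTV\b")

print(bool(with_spaces.search(header)))     # False
print(bool(without_spaces.search(header)))  # True
```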

Jmeter-Regular expression extractor

My JMeter response returns 'Location' in the response header. I want to fetch this Location header and use it in my other requests.
Sample Start: 2015-07-24 14:46:38 CEST
Load time: 163
Latency: 163
Size in bytes: 372
Headers size in bytes: 350
Body size in bytes: 22
Sample Count: 1
Error Count: 1
Response code: 201
Response message: Processed
Response headers:
HTTP/1.1 201 Processed
X-Backside-Transport: OK OK,FAIL FAIL
Connection: Keep-Alive
Transfer-Encoding: chunked
****Location: /retail/iows/ie/en/storage/servicedocs/paxplanner/2015-07-24/eCommerce.pdf****
X-Client-IP: 127.0.0.1,10.62.26.150
Content-Type: application/octet-stream
Date: Fri, 24 Jul 2015 12:46:38 GMT
X-Archived-Client-IP: 127.0.0.1
Steps I followed:
I have used Regular expression extractor.
Enabled response header radio button with the whole location header.
Please help me to sort it out.
If you want to retrieve the Location field's value from the response, you might want to try the following pattern: Location:\s*([^\r\n]+). The first matching group will contain the value of the Location field. (Inside a character class a ? is a literal character, so writing [^\r?\n] would cut the value short at the first ? in a URL.)
The above expression is based on the following rules:
HTTP header fields are colon (":") separated <key, value> pairs.
HTTP header fields are terminated by the EOL char combination (CR and LF)
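As a quick check of a pattern along these lines, here is a sketch in Python run against some of the response headers from the question. JMeter uses its own regex engine, so this only illustrates the pattern itself; the variable names are made up.

```python
import re

# A few of the response headers from the question, with CR LF line endings.
response_headers = (
    "HTTP/1.1 201 Processed\r\n"
    "Transfer-Encoding: chunked\r\n"
    "Location: /retail/iows/ie/en/storage/servicedocs/paxplanner/2015-07-24/eCommerce.pdf\r\n"
    "X-Client-IP: 127.0.0.1,10.62.26.150\r\n"
)

# Capture everything after "Location:" up to the end of that line.
location = re.search(r"Location:\s*([^\r\n]+)", response_headers)
print(location.group(1))
```

In the Regular Expression Extractor, the first group would be selected with the template $1$.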
Please try this..
Location:([\s\S]*)X-Client
If it doesn't work then try to use a \ before - in X-Client (escaping -)

RegEx match IP on Mail-Header Received:

I am trying to put together a RegEx that returns only the sender IP address:
http://regexr.com?38atl
This is the RegEx I built, but can't complete:
(?<=\bReceived: from .*\[)(?:\d{1,3}\.){3}\d{1,3}
or
(?<=\bReceived: from )(.*\[)(?:\d{1,3}\.){3}\d{1,3}
So it should only match this (on lines beginning with: Received: from)
127.0.0.1
127.0.0.1
21.22.23.24
And these are the example mail headers I'm searching in:
To: a#domain.de
Return-Path: <t#domain.de>
X-Original-To: a#domain.de
Delivered-To: c#domain.tld
Received: from localhost (localhost [127.0.0.1])
by mail1.domain.tld (Postfix) with ESMTP id 3fT3TR72zNz8m8
for <a#domain.de>; Tue, 18 Feb 2014 14:54:35 +0100 (CET)
X-Virus-Scanned: Debian amavisd-new at mail1.domain.tld
X-Spam-Flag: YES
X-Spam-Score: 5.773
X-Spam-Level: *****
X-Spam-Status: Yes, score=5.773 tagged_above=1 required=4.5
tests=[BAYES_05=-0.5, MISSING_MID=0.497, RCVD_IN_PBL=3.335,
RCVD_IN_RP_RNBL=1.31, RDNS_DYNAMIC=0.982, TO_NO_BRKTS_DYNIP=0.139,
T_RCVD_IN_SEMBLACK=0.01] autolearn=no
Received: from mail1.domain.tld ([127.0.0.1])
by localhost (mail1.domain.tld [127.0.0.1]) (amavisd-new, port 10024)
with ESMTP id lDJqiZjBn2t4 for <a#domain.de>;
Tue, 18 Feb 2014 14:54:34 +0100 (CET)
Received: from mail.domain.tld (pAAAAAAAA.dip0.t-ipconnect.de [21.22.23.24])
by mail1.domain.tld (Postfix) with SMTP id 3fT3TQ4Nwgz8m5
for <a#domain.de>; Tue, 18 Feb 2014 14:54:34 +0100 (CET)
Date: Tue, 18 Feb 2014 15:02:11 +0100
Sender: "From" <t#domain.de>
From: "From" <t#domain.de>
Subject: Subbbb (192.168.123.123)
Reply-To: t#domain.de
MIME-Version: 1.0
Content-type: text/plain; charset=UTF-8
Message-Id: <3fT3TR72zNz8m8#mail1.domain.tld>
Try this expression:
Received: +from[^\n]*?\[([0-9\.]+)\]
Edit:
For a PHP script try something like this (where $emailHeader contains the data you are searching):
$regex = '/Received: +from[^\\n]*?\\[([0-9\\.]+)\\]/s';
if (preg_match_all($regex, $emailHeader, $matches_out)) {
print_r($matches_out);
} else {
print('Sender IP not found');
}
The <= at the start looks funny, but other than that it seems to be working fine:
(?:\bReceived: from .*\[)((\d{1,3}\.){3}\d{1,3})(?:]\))
I believe what you're looking for is:
(?:\bReceived: from .*?\[)(?<ip>(?:\d{1,3}\.){3}\d{1,3})
the matched IP address will be in capture group named "ip".
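For comparison, the same named-group idea can be tried on the Received: lines from the question. This is a sketch in Python, which spells a named group (?P<ip>...) rather than (?<ip>...); the variable names are made up for illustration.

```python
import re

# The Received: headers (and one decoy) from the sample in the question.
mail_headers = """\
Received: from localhost (localhost [127.0.0.1])
by mail1.domain.tld (Postfix) with ESMTP id 3fT3TR72zNz8m8
Received: from mail1.domain.tld ([127.0.0.1])
by localhost (mail1.domain.tld [127.0.0.1]) (amavisd-new, port 10024)
Received: from mail.domain.tld (pAAAAAAAA.dip0.t-ipconnect.de [21.22.23.24])
by mail1.domain.tld (Postfix) with SMTP id 3fT3TQ4Nwgz8m5
Subject: Subbbb (192.168.123.123)
"""

# Lazy .*? stops at the first "[" on a line that starts "Received: from".
pattern = re.compile(r"\bReceived: from .*?\[(?P<ip>(?:\d{1,3}\.){3}\d{1,3})")

sender_ips = [m.group("ip") for m in pattern.finditer(mail_headers)]
print(sender_ips)  # ['127.0.0.1', '127.0.0.1', '21.22.23.24']
```

Because . does not match a newline by default, the continuation lines that begin with "by" are never reached from a "Received: from" prefix, so the bracketed address on those lines is skipped.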

Detecting characters in C++ char stream

I am working on a piece of arduino code that is using the BlackWidow version with wifi built in. Using the WiServer.h library, I'm using the SimpleClient.pde example with mods to send a call to a webserver that will simply return an integer - 0, 1, or 2. The end goal is to turn on a pin for the proper red, green, or yellow of a stoplight. The integers represent the aggregate state of our Hudson CI.
I'm a PHP lazy bastard, and pointers scare me. The code I am working with is
// Function that prints data from the server
void printData(char* data, int len) {
// Print the data returned by the server
// Note that the data is not null-terminated, may be broken up into smaller packets, and
// includes the HTTP header.
while (len-- > 0) {
Serial.print(*(data++));
}
}
printData() is the callback of the call to the webserver, and when run it sends the following to the serial monitor (this is 3 loops, no newline before new output):
HTTP/1.1 200 OK
Date: Thu, 10 Feb 2011 17:37:37 GMT
Server: Apache/2.2.13 (Unix) mod_ssl/2.2.13 OpenSSL/0.9.8k DAV/2 PHP/5.2.11
X-Powered-By: PHP/5.2.11
Content-Length: 1
Connection: close
Content-Type: text/html
0HTTP/1.1 200 OK
Date: Thu, 10 Feb 2011 17:37:45 GMT
Server: Apache/2.2.13 (Unix) mod_ssl/2.2.13 OpenSSL/0.9.8k DAV/2 PHP/5.2.11
X-Powered-By: PHP/5.2.11
Content-Length: 1
Connection: close
Content-Type: text/html
0HTTP/1.1 200 OK
Date: Thu, 10 Feb 2011 17:37:58 GMT
Server: Apache/2.2.13 (Unix) mod_ssl/2.2.13 OpenSSL/0.9.8k DAV/2 PHP/5.2.11
X-Powered-By: PHP/5.2.11
Content-Length: 1
Connection: close
Content-Type: text/html
0
The part that I need to identify is the 0, which could also be 1 or 2.
Instead of printData(), this function will become turnOnAppropriateLight() or something, by simply setting a pin to HIGH. This will then activate a relay, to power the corresponding LED array.
Now that I've written this up it looks like I just need to keep the last character around and do a switch based on the value. The *(data++) is the confusing part even though I know it's incrementing a pointer index...I'm just not sure how to go directly to the last char in that index. No need for this looping to spit out the result.
This is not robust AT ALL, but
Serial.print(data[len-1])
See what that gets you
this should be all you need:
data[len - 1]
You could be neurotic and parse each line, or look for the last tags: Content-Type:.
I would convert the C-style string into a C++ std::string (passing the length, since the data is not null-terminated), then use the find method to look for the keywords.
The std::istringstream can be used to convert from text "0" to numeric 0.