Varnishlog log only specified IP - regex

I want to log varnish backend request which matches specified IP (for example 127.0.0.1).
So i have
"varnishlog -b -I BereqHeader:X-Forwarded-For: 127.0.0.1'"
Which actualy logs only the "BereqHeader:X-Forwarded-For:" part. I want to log full request, not only IP part.
That was first question, the second one is: how to disable loging empty request? I mean, if i have regex filter then i have a lot of request looking like this "* << BeReq >> 307454" and i obviously dont want to see them.

I have a solution. Log the data by
varnishlog -b -I BereqHeader:'X-Forwarded-For: 123.215.32.76' -i [other tags to log] > file.varnishlog
and then grep it by
cat file.varnishlog | grep -Pzo '* {3}<< BeReq {4}>>.\n- BereqHeader.+\n(-.\n)*'
which'll give us expected results.

Related

Grepping two patterns from event logs

I am seeking to extract timestamps and ip addresses out of log entries containing a varying amount of information. The basic structure of a log entry is:
<timestamp>, <token_1>, <token_2>, ... ,<token_n>, <ip_address> <token_n+2>, <token_n+3>, ... ,<token_n+m>,-
The number of tokens n between the timestamp and ip address varies considerably.
I have been studying regular expressions and am able to grep timestamps as follows:
grep -o "[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}T[0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}"
And ip addresses:
grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
But I have not been able to grep both patterns out of log entries which contain both. Every log entry contains a timestamp, but not every entry contains an ip address.
Input:
2021-04-02T09:06:44.248878+00:00,Creation Time,EVT,WinEVTX,[4624 / 0x1210] Source Name: Microsoft-Windows-Security-Auditing Message string: An account was successfully logged on.\n\nSubject:\n\tSecurity ID:\t\tS-1-5-18\n\tAccount Name:\t\tREDACTED$\n\tAccount Domain:\t\tREDACTED\n\tLogon ID:\t\tREDACTED\n\nLogon Type:\t\t\t10\n\nNew Logon:\n\tSecurity ID:\t\tREDACTED\n\tAccount Name:\t\tREDACTED\n\tAccount Domain:\t\tREDACTED\n\tLogon ID:\t\REDACTED\n\tLogon GUID:\t\tREDACTED\n\nProcess Information:\n\tProcess ID:\t\tREDACTED\n\tProcess Name:\t\tC:\Windows\System32\winlogon.exe\n\nNetwork Information:\n\tWorkstation:\tREDACTED\n\tSource Network Address:\t255.255.255.255\n\tSource Port:\t\t0\n\nDetailed Authentication Information:\n\tLogon Process:\t\tUser32 \n\tAuthentication Package:\tNegotiate\n\tTransited Services:\t-\n\tPackage Name (NTLM only):\t-\n\tKey Length:\t\t0\n\nThis event is generated when a logon session is created. It is generated on the computer that was accessed.\n\nThe subject fields indicate the account on the local system which requested the logon. This is most commonly a service such as the Server service or a local process such as Winlogon.exe or Services.exe.\n\nThe logon type field indicates the kind of logon that occurred. The most common types are 2 (interactive) and 3 (network).\n\nThe New Logon fields indicate the account for whom the new logon was created i.e. the account that was logged on.\n\nThe network fields indicate where a remote logon request originated. Workstation name is not always available and may be left blank in some cases.\n\nThe authentication information fields provide detailed information about this specific logon request.\n\t- Logon GUID is a unique identifier that can be used to correlate this event with a KDC event.\n\t- Transited services indicate which intermediate services have participated in this logon request.\n\t- Package name indicates which sub-protocol was used among the NTLM protocols.\n\t- Key length indicates the length of the generated session key. This will be 0 if no session key was requested. Strings: ['S-1-5-18' 'DEVICE_NAME$' 'NETWORK' 'REDACTED' 'REDACTED' 'USERNAME' 'WORKSTATION' 'REDACTED' '10' 'User32 ' 'Negotiate' 'REDACTED' '{REDACTED}' '-' '-' '0' 'REDACTED' 'C:\\Windows\\System32\\winlogon.exe' '255.255.255.255' '0' '%%1833'] Computer Name: REDACTED Record Number: 1068355 Event Level: 0,winevtx,OS:REDACTED,-
Desired Output:
2021-04-02T09:06:44, 255.255.255.255
$ sed -En 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}).*[^0-9]([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*/\1, \2/p' file
2021-04-02T09:06:44, 255.255.255.255
Your regexps can be reduced by removing some of the explicit repetition though:
$ sed -En 's/.*([0-9]{4}(-[0-9]{2}){2}T([0-9]{2}:){2}[0-9]{2}).*[^0-9](([0-9]{1,3}\.){3}[0-9]{1,3}).*/\1, \4/p' file
2021-04-02T09:06:44, 255.255.255.255
It could be simpler still if all of the lines in your log file start with a timestamp:
$ sed -En 's/([^,.]+).*[^0-9](([0-9]{1,3}\.){3}[0-9]{1,3}).*/\1, \2/p' file
2021-04-02T09:06:44, 255.255.255.255
If you are looking for lines that contain both patterns, it may be easiest to do it two separate searches.
If you're searching your log file for lines that contain both "dog" and "cat", it's usually easiest to do this:
grep dog filename.txt | grep cat
The grep dog will find all lines in the file that match "dog", and then the grep cat will search all those lines for "cat".
You seem not to know the meaning of the "-o" switch.
Regular "grep" (without "-o") means: give the entire line where the pattern can be found. Adding "-o" means: only show the pattern.
Combining two "grep" in a logical AND-clause can be done using a pipe "|", so you can do this:
grep <pattern1> <filename> | grep <pattern2>

bash script - fetch only unique domains from email list to variable

I am new to bash and having problem understanding how to get this done.
Check all "To:" field email address domains and list all unique domains to a variable to compare it to from domain.
I get the "from address" domain by using
grep -m 1 "From: " filename | cut -f 2 -d '#' | cut -d ">" -f 1
when reading a mail stored in file filename.
For "to address" domain there can be multiple To: addresses and having multiple domains. I am not sure how to get unique domains from "to address field".
Example to address line will be like this:
To: user#domain.com, user2#domain.com,
User Name <sample#domaintest.com>, test#domainname.com
grep -m 1 "^To: " filename | cut -f 2 -d '#' | cut -d ">" -f 1
but there are different format of email. So I am not sure if grep is right or if I should search for awk or something.
I need to get the unique domain list from the "To:" field email address/addresses to a variable in bash script.
Desired output for above example:
domain.com,domaintest.com,domainname.com
If you are hellbent on doing this with line-oriented utilities, there is a utility formail in the Procmail distribution which can normalize things for you somewhat.
bash$ formail -czxTo: <<\==test==
> From: me <sender#example.com>
> To: you <first#example.org>,
> them <other#example.net>
> Subject: quick demo
>
> Very quick, innit.
> ==test==
first#example.org, other#example.net
So with that you have input which you can actually pass to grep or Awk ... or sed.
fromdom=$(formail -czxTo: <message | tr ',' '\n' | sed 's/.*#//')
The From: address will not be normalized by formail -czxFrom: but you can use a neat trick: make formail generate a reply back to the From: address, and then extract the To: header from that.
todoms=$(formail -rtzcxTo: <message | sed 's/.*#//')
In some more detail, -r says to create a new reply to whoever sent you message, and then we do -zcxTo: on that.
(The -t option may or may not do what you want. In this case, I would perhaps omit it. http://www.iki.fi/era/procmail/formail.html has (vague) documentation for what it does; see also the section just before http://www.iki.fi/era/procmail/mini-faq.html#group-writable and sorry for the clumsy link -- there doesn't seem to be a good page-internal anchor to link to.)
Email address normalization is tricky because there are so many variants to choose from.
From: Elvis Parsley <king#graceland.example.com>
From: king#graceland.example.com
From: "Parsley, Elvis" <king#graceland.example.com> (kill me, I have to use Outlook)
From: "quoted#string" <king#graceland.example.com> (wait, he is already dead)
To: This could fold <recipient#example.net>,
over multiple lines <another#example.org>
I would turn to a more capable language with proper support for parsing all of these formats. My choice would be Python, though you could probably also pull this off in a few lines of Ruby or Perl.
The email library was revamped in Python 3.6 so this assumes you have at least that version. The email.Headerregistry class which is new in 3.6 is particularly convenient here.
#!/usr/bin/env python3
from email.policy import default
from email import message_from_binary_file
import sys
if len(sys.argv) == 1:
sys.argv.append('-')
for arg in sys.argv[1:]:
if arg == '-':
handle = sys.stdin
else:
handle = open(arg, 'rb')
message = message_from_binary_file(handle, policy=default)
from_dom = message.get('From').address.domain
to_doms = set()
for addr in message.get('To').addresses:
dom = addr.domain
if dom == from_dom:
continue
to_doms.add(dom)
print(','.join([from_dom] + list(to_doms)))
if arg != '-':
handle.close()
This simply produces a comma-separated list of domain names; you might want to do the rest of the processing in Python too instead, or change this so that it prints something in a slightly different format.
You'd save this in a convenient place (say, /usr/local/bin/fromto) and mark it as executable (chmod 755 /usr/local/bin/fromto). Now you can call this from the shell like any other utility like grep.

Bash Script: sed/awk/regex to match an IP address and replace

I have a string in a bash script that contains a line of a log entry such as this:
Oct 24 12:37:45 10.224.0.2/10.224.0.2 14671: Oct 24 2012 12:37:44.583 BST: %SEC_LOGIN-4-LOGIN_FAILED: Login failed [user: root] [Source: 10.224.0.58] [localport: 22] [Reason: Login Authentication Failed] at 12:37:44 BST Wed Oct 24 2012
To clarify; the first IP listed there "10.224.0.2" was the machine the submitted this log entry, of a failed login attempt. Someone tried to log in, and failed, from the machine at the 2nd IP address in the log entry, "10.224.0.58".
I wish to replace the first occurrence of the IP address "10.224.0.2" with the host name of that machine, as you can see presently is is "IPADDRESS/IPADDRESS" which is useless having the same info twice. So here, I would like to grep (or similar) out the first IP and then pass it to something like the host command to get the reverse host and replace it in the log output.
I would like to repeat this for the 2nd IP "10.224.0.58". I would like to find this IP and also replace it with the host name.
It's not just those two specific IP address though, any IP address. So I want to search for 4 integers between 1 and 3, separated by 3 full stops '.'
Is regex the way forward here, or is that over complicating the issue?
Many thanks.
Replace a fixed IP address with a host name:
$ cat log | sed -r 's/10\.224\.0\.2/example.com/g'
Replace all IP addresses with a host name:
$ cat log | sed -r 's/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/example.com/g'
If you want to call an external program, it's easy to do that using Perl (just replace host with your lookup tool):
$ cat log | perl -pe 's/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/`host \1`/ge'
Hopefully this is enough to get you started.
There's variou ways to find th IP addresses, here's one. Just replace "printf '<<<%s>>>' " with "host" or whatever your command name is in this GNU awk script:
$ cat tst.awk
{
subIp = gensub(/\/.*$/,"","",$4)
srcIp = gensub(/.*\[Source: ([^]]+)\].*/,"\\1","")
"printf '<<<%s>>>' " subIp | getline subName
"printf '<<<%s>>>' " srcIp | getline srcName
gsub(subIp,subName)
gsub(srcIp,srcName)
print
}
$
$ gawk -f tst.awk file
Oct 24 12:37:45 <<<10.224.0.2>>>/<<<10.224.0.2>>> 14671: Oct 24 2012 12:37:44.583 BST: %SEC_LOGIN-4-LOGIN_FAILED: Login failed [user: root] [Source: <<<10.224.0.58>>>] [localport: 22] [Reason: Login Authentication Failed] at 12:37:44 BST Wed Oct 24 2012
googled this one line command together. but was unable to pass the founded ip address to the ssh command:
sed -n 's/\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}/\nip&\n/gp' test | grep ip | sed 's/ip//' | sort | uniq
the "test" is the file the sed command is searching for for the pattern

Retrieve the unique websites from an url list

I have a list with a lot of page urls. I want to retrieve the unique websites.
"http://www.gadgetgiants.com/products/mica-8-inch-touchscreen-android-2-3-tablet-wifi-1-2ghz-cpu-flash10-3"
"http://www.malma.mx/products/pan-digital"
"http://www.gadgetgiants.com/products/snowpad-7-capacitive-multi-touch-screen-android-2-3-tabletwifi-samsung-cortex-a8-1-2ghz-cpu-camera-1080p-external-3g"
"http://www.spiritualityandwellness.com/products/internalized-motivation"
"http://www.spiritualityandwellness.com/products/evergreen-motivation"
Will result to:
www.gadgetgiants.com
www.malma.mx
www.spiritualityandwellness.com
egrep -o "www\.[a-zA-Z0-9.-]*\.[a-zA-Z]{2,4}" YOUR_FILE_NAME | sort -u
got the regex from here
(Edit) Example Usage and Output
$ cat ur.txt
"http://www.gadgetgiants.com/products/mica-8-inch-touchscreen-android-2-3"
"http://www.malma.mx/products/pan-digital"
"http://www.gadgetgiants.com/products/snowpad-7-capacitive-multi-touch"
"http://www.spiritualityandwellness.com/products/internalized-motivation"
"http://www.spiritualityandwellness.com/products/evergreen-motivation"
"http://www.swellness.com.au/products/evergreen-motivation"
$ egrep -o "www\.[a-zA-Z0-9.-]*\.[a-zA-Z]{2,4}" ur.txt | sort -u
www.gadgetgiants.com
www.malma.mx
www.spiritualityandwellness.com
www.swellness.com.au
Idea w/o regex:
Retrieve host from each address:
Uri uri = new Uri (yourLink);
string host = uri.Host;
Now you can just put all these hosts into HashSet or something.

How can I send an automated reply to the sender and all recipients with Procmail?

I'd like to create a procmail recipe or Perl or shell script that will send an auto response to the original sender as well as anybody that was copied (either To: or cc:) on the original email.
Example:
bob#example.com writes an email to john#example.com and paul#example.com (in the To: field). Copies are sent via cc: to rob#example.com and alice#example.com.
I'd like the script to send an auto response to the original sender (bob#example.com) and everybody else that was sent a copy of the email (john#example.com, paul#example.com, rob#example.com and alice#example.com).
Thanks
You should be able to accomplish this using the this procmail module for Perl 5. You could also just use the procmail configuration files to do this as well.
Here's an example of our procmail configuration sending e-mails "through" a perl script.
:0fw
* < 500000
| /etc/smrsh/decode_subject.pl
I hope that helps get ya started.
FROM=`formail -rtzxTo:`
CC=`formail -zxTo: -zxCc: | tr '\n' ,`
:0c
| ( echo To: "$FROM"; echo Cc: "$CC"; echo Subject: auto-reply; \
echo; echo Please ignore. ) \
| $SENDMAIL -oi -t
A well-formed auto-reply should set some additional headers etc; but this should hopefully be enough to get you started. See also http://porkmail.org/era/mail/autoresponder-faq.html
Depending on you flavor of tr you might need to encode the newline differently; not all implementations of tr understand the '\n' format. Try with '\012' or a literal newline in single quotes if you cannot get this to work.