Integrate existing Procmail with new SpamAssassin

For many years I have been using procmail and its recipes without issue, and I have a great many recipes.
A few weeks ago, my system started using SpamAssassin, and now the procmail recipes that worked for years have stopped working.
I am just a regular user on the system, and the system administrators are not available for assistance.
Can someone tell me what I need to do to fix my procmail (or SpamAssassin) setup so it works like it did before? Before, it would place email with "SPAM" in the subject into a spam folder and various mailing lists into their own mailboxes. Now, it just marks spam as "***SPAM***" and my mailing lists remain in my inbox.
Any help, links, etc. are appreciated.
From my procmail.log file:
procmail: [6769] Sun Jun 21 22:43:23 2015
procmail: Assigning "JFDIR=/arpa/tz/z/zaxxon/.junkfilter/junkfilter"
procmail: Assigning "JFUSERDIR=/arpa/tz/z/zaxxon/.junkfilter/junkfilter/user_bloo
cklist"
procmail: Assigning "FROM=^(From[ ]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender)::
)(.*\<)?"
procmail: No match on "^Subject: Zaxxon envdump$"
procmail: Match on "< 256000"
procmail: Locking "spamassassin.lock"
procmail: Executing "spamassassin"
/bin/sh: Can't open spamassassin
procmail: Error while writing to "spamassassin"
procmail: Rescue of unfiltered data succeeded
procmail: Unlocking "spamassassin.lock"
procmail: No match on "^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*"
procmail: Match on "^X-Spam-Status: Yes"
procmail: Locking ".lock"
procmail: Assigning "LASTFOLDER="
procmail: Opening ""
procmail: Error while writing to ""
procmail: Unlocking ".lock"
procmail: No match on "^^rom[ ]"
procmail: No match on "^(From[ ]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender)::
)(.*\<)?.*(facebook|pottermore|mangafox).*"
procmail: No match on "^(From[ ]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender)::
)(.*\<)?.*(facebook|pottermore|mangafox).*"
procmail: No match on "^(From[ ]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender)::
)(.*\<)?.*(archiveofourown|ficwad|tthfanfic|fanficauthors|sufficientvelocity).*"
procmail: No match on "^(From[ ]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender)::
)(.*\<)?.*(archiveofourown|ficwad|tthfanfic|fanficauthors|sufficientvelocity).*"
procmail: No match on "^(From[ ]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender)::
)(.*\<)?.*(empornium|pornhub|tumblr).*"
procmail: No match on "^(From[ ]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender)::
)(.*\<)?.*(empornium|pornhub|tumblr).*"
procmail: Match on "^(From[ ]|(Old-|X-)?(Resent-)?(From|Reply-To|Sender):)(.*\\
<)?.*(sdf\.org|lastpass\.com|xmarks\.com).*"
procmail: Locking "/var/mail/zaxxon.lock"
procmail: Assigning "LASTFOLDER=/var/mail/zaxxon"
procmail: Opening "/var/mail/zaxxon"
procmail: Acquiring kernel-lock
procmail: Unlocking "/var/mail/zaxxon.lock"
From stephaniewilson@ambertuild.biz Sun Jun 21 22:43:18 2015
Subject: *****SPAM***** Is Alcohol Controling Your Life?
Folder: /var/mail/zaxxon 20780
The spam rule is
:0:
* ^Subject:.*[Ss][Pp][Aa][Mm].*
junkmail

procmail: Match on "< 256000"
procmail: Locking "spamassassin.lock"
procmail: Executing "spamassassin"
/bin/sh: Can't open spamassassin
procmail: Error while writing to "spamassassin"
procmail: Rescue of unfiltered data succeeded
procmail: Unlocking "spamassassin.lock"
This tells you that you have a rule which is pretty much exactly
:0fw:spamassassin.lock
* < 256000
| spamassassin
but there is no binary named spamassassin on the system where this recipe runs, so it fails.
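The subsequent "Error while writing to" message is harder to diagnose. The log shows procmail matching ^X-Spam-Status: Yes and then trying to open a folder with an empty name (Opening ""), so the recipe's destination probably expands to an empty string. The recipe might look something like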
:0
which of course makes no sense.
The regex for the list headers appears to have a typo -- no legitimate emails will have headers with two adjacent colon characters. The value in the FROM= assignment should only have a single colon.
As a general stylistic remark, a trailing .* on a (non-capturing) regular expression is always redundant.
Diagnosing these problems without access to the faulty .procmailrc is challenging. If you still need help, definitely take care to include the actual code you are having problems with, as clearly described in the help section. Questions without the actual problematic code are likely to get downvoted and/or closed.
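The point about the doubled colon can be checked outside procmail. The sketch below is only an approximation in Python: the two patterns are simplified stand-ins for the FROM= value seen in the log (the optional `(.*\<)?` tail is left out because procmail's `\<` has no direct Python equivalent), and the header lines are made up for illustration.

```python
import re

# Simplified stand-ins for the FROM= value from the log (assumption:
# the real value in .procmailrc is not shown in the question).
FROM_TYPO = r"^(From |(Old-|X-)?(Resent-)?(From|Reply-To|Sender)::)"
FROM_FIXED = r"^(From |(Old-|X-)?(Resent-)?(From|Reply-To|Sender):)"

# Hypothetical header lines, invented for this example.
headers = [
    "From: Some List <list@example.org>",
    "Reply-To: list@example.org",
    "X-Resent-Sender: someone@example.org",
]


def matches(pattern, lines):
    """Return the lines that the pattern matches."""
    return [line for line in lines if re.search(pattern, line)]


typo_hits = matches(FROM_TYPO, headers)    # requires "::" after the name
fixed_hits = matches(FROM_FIXED, headers)  # requires a single ":"
print(typo_hits)
print(fixed_hits)
```

With the doubled colon none of the sample headers match, which is consistent with every mailing-list recipe in the log reporting "No match".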

Related

Python regex match n lines after match

I have a pattern that works fine on regexr.com with PCRE, but when I use it with Python it doesn't match anything.
The pattern is:
.*(?<=RSA SHA256:).*(?:.*\n){3}.*
It matches the data on the website, but when I run it in my Python script it doesn't.
The goal is to match the "Accepted publickey" line and the next 3 lines.
Thank you!
Script below:
import re
Accepted_publickey=r'.*(?<=RSA SHA256:).*(?:.*\n){3}.*'
file=open('secure')
for items in file:
    re1=re.search(Accepted_publickey,items)
    if re1:
        print(re1.group())
The actual data is:
Oct 21 17:27:21 localhost sshd[19772]: Accepted publickey for vagrant from 192.168.2.140 port 54614 ssh2: RSA SHA256:uDsE4ecSD9ElWQ5Q0fdMsbqEzOe0Hszilv8xhU6dT6M
Oct 21 17:27:22 localhost sshd[19772]: pam_unix(sshd:session): session opened for user vagrant by (uid=0)
Oct 21 17:27:22 localhost sshd[19772]: User child is on pid 19774
Oct 21 17:27:22 localhost sshd[19774]: Starting session: shell on pts/2 for vagrant from 192.168.2.140 port 54614 id 0

You don't need a lookbehind; you can match the value directly.
To match the following 3 lines, swap the order of the newline and .* inside the group, which lets you drop the trailing .*
^.*\bRSA SHA256:.*(?:\n.*){3}
^ Start of string (or start of a line when re.M is used)
.*\bRSA SHA256:.* Match RSA SHA256: preceded by a word boundary, together with the rest of that line
(?:\n.*){3} Repeat 3 times: a newline followed by any characters except a newline
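Iterating over the file hands re.search one line at a time, so a pattern that spans four lines can never match. In your code you might use read() instead, with re.M so that ^ matches at the start of each line: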
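import re
Accepted_publickey = r'^.*\bRSA SHA256:.*(?:\n.*){3}'
# Read the whole file so the pattern can span several lines
with open('secure') as f:
    items = f.read()
# re.M lets ^ match at the start of every line, not only the first
re1 = re.search(Accepted_publickey, items, re.M)
if re1:
    print(re1.group())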
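To make the difference concrete, here is a small self-contained sketch that runs the same pattern both ways. It uses the four sample lines from the question, plus one hypothetical unrelated line placed before them (an assumption, added only to show why re.M is needed).

```python
import re

# Sample data from the question; the first line is hypothetical.
LOG = """\
Oct 21 17:27:20 localhost sshd[19770]: Connection closed by 192.168.2.140
Oct 21 17:27:21 localhost sshd[19772]: Accepted publickey for vagrant from 192.168.2.140 port 54614 ssh2: RSA SHA256:uDsE4ecSD9ElWQ5Q0fdMsbqEzOe0Hszilv8xhU6dT6M
Oct 21 17:27:22 localhost sshd[19772]: pam_unix(sshd:session): session opened for user vagrant by (uid=0)
Oct 21 17:27:22 localhost sshd[19772]: User child is on pid 19774
Oct 21 17:27:22 localhost sshd[19774]: Starting session: shell on pts/2 for vagrant from 192.168.2.140 port 54614 id 0
"""

PATTERN = r'^.*\bRSA SHA256:.*(?:\n.*){3}'

# Line by line: each chunk holds at most one newline, so {3} cannot succeed.
per_line = [re.search(PATTERN, line, re.M)
            for line in LOG.splitlines(keepends=True)]

# Whole text: the pattern is free to cross line boundaries.
whole = re.search(PATTERN, LOG, re.M)
matched_lines = whole.group().splitlines() if whole else []
print(len(matched_lines))
```

The per-line searches all come back empty, while the whole-text search returns the "Accepted publickey" line and the three lines after it.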

Using Regular Expressions in Redshift to get the word prior to matched pattern

Using regexp_substr(), I want to find the word just preceding the first occurrence of the word 'OF'. The code below does not work, as Redshift does not seem to support non-greedy pattern matching.
Please help.
select regexp_substr('SAFETY COUNCIL OF PALM BEACH COUNTY, INC. ','[[:print:]].*?\\sOF\\s')
Query execution failed
Reason:
SQL Error [XX000]: ERROR: Invalid preceding regular expression prior to repetition operator. The error occurred while parsing the regular expression fragment: 'rint:]].*?>>>HERE>>>\sOF\s'.
Detail:
-----------------------------------------------
error: Invalid preceding regular expression prior to repetition operator. The error occurred while parsing the regular expression fragment: 'rint:]].*?>>>HERE>>>\sOF\s'.
code: 8002
context: T_regexp_init
query: 0
location: funcs_expr.cpp:189
process: padbmaster [pid=74292]
-----------------------------------------------
Where: SQL function "regexp_substr" statement 1
I am currently using the following approach, which is clumsy, and I believe there should be a better one:
select 'SAFETY OF COUNCIL OF PALM OF BEACH COUNTY, INC. ' as name, regexp_instr(name,'\\sOF\\s',1) as ind1,substr(name,1,ind1-1) as name_2,regexp_replace(name_2,regexp_substr(name_2,'.*\\s'),'')

To achieve this, I usually use the split_part function. The same works in PostgreSQL.
select split_part('SAFETY COUNCIL OF PALM BEACH COUNTY, INC. ', 'OF',1)
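Note that split_part(..., 'OF', 1) on its own returns everything before the first 'OF' ('SAFETY COUNCIL ' for the sample string), so a second step is still needed to isolate the last word. The logic can be sketched outside the database; the Python below is only an illustration of the idea, not Redshift code, and word_before is a hypothetical helper name.

```python
def word_before(text, marker="OF"):
    """Return the word just before the first standalone `marker`, or None."""
    words = text.split()
    for i, word in enumerate(words):
        if word == marker:
            # No preceding word exists if the marker comes first.
            return words[i - 1] if i > 0 else None
    return None


print(word_before('SAFETY COUNCIL OF PALM BEACH COUNTY, INC. '))
```

Comparing whole words rather than splitting on the bare substring also avoids cutting inside words such as OFFICE that merely contain "OF".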

How to fix a problem when finding all text that matches a regular expression in Perl?

I want to get all text matching a date regex in a given string, but I don't get the expected result.
I have a string like this: "Lomba lari akan diadakan pada tanggal 15 Agustus 2019"
@dates2 = $line =~ m/(\d{1,2}\s(Januari|Februari|Maret|April|Mei|Juni|Juli|Agustus|September|Oktober|November|Desember)\s\d{4})/g;
$length2 = @dates2;
print "@dates2\n";
print "Length 2 : $length2\n";
$date_occurence += $length2;
I want only "15 Agustus 2019" in the array @dates2, but I get both "15 Agustus 2019" and "Agustus". Can anyone tell me how the regex match works here?

You're getting Agustus in the output because of the month alternation (Januari|...|Desember) which is a capturing group. To remove it, just make your internal alternation non-capturing i.e.
(?:Januari|Februari|Maret|April|Mei|Juni|Juli|Agustus|September|Oktober|November|Desember)
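The same effect can be reproduced outside Perl. As a hedged illustration (Python rather than Perl, so the details of list-context matching differ), Python's re.findall returns a tuple of all groups per match when the pattern has more than one capturing group, and only the single group's text once the inner alternation is made non-capturing:

```python
import re

MONTHS = ("Januari|Februari|Maret|April|Mei|Juni|Juli|Agustus|"
          "September|Oktober|November|Desember")
line = "Lomba lari akan diadakan pada tanggal 15 Agustus 2019"

# Inner alternation captures: each match yields (whole date, month).
capturing = re.findall(r"(\d{1,2}\s(" + MONTHS + r")\s\d{4})", line)

# Inner alternation is non-capturing: each match yields only the date.
non_capturing = re.findall(r"(\d{1,2}\s(?:" + MONTHS + r")\s\d{4})", line)

print(capturing)
print(non_capturing)
```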

Regex to pick out correct time in powershell depending on what is next to it

I am using PowerShell with a regex to extract the time "01:42:35" from the line below. However, I want to ignore the time "02:42:35", and I am unsure how to do it.
2013-07-04 02:42:35 Alert 172.172.19.9 Jul 4 01:42:35 ...
Currently I am using this time regex: $time_regex = "(\d+):(\d+):(\d+)"
How can I adapt this to the above specification?
Note: the time I am trying to get is not at the end of the line. The time I want always has a date next to it in the format "Jul 4 ", whereas the time I want to ignore has a date next to it in the format "2013-07-04".
Thanks

$time_regex = "(?<=\w+ \d+ )(\d+):(\d+):(\d+)"
will only match a time string that's preceded by an alphanumeric "word" and a number.
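Not every regex engine accepts a variable-length lookbehind like this one. .NET does, but Python's built-in re module, for instance, requires lookbehinds to be fixed-width. A rough sketch of the same idea in Python, swapping the lookbehind for an ordinary prefix plus capturing groups (the pattern name is made up for this example):

```python
import re

line = "2013-07-04 02:42:35 Alert 172.172.19.9 Jul 4 01:42:35 ..."

# Match "<month name> <day> " first, then capture the time that follows.
# [A-Za-z]+ keeps the numeric date "2013-07-04" from counting as the prefix.
time_after_month = re.compile(r"[A-Za-z]+ +\d+ +(\d+):(\d+):(\d+)")

m = time_after_month.search(line)
hours, minutes, seconds = m.groups() if m else (None, None, None)
print(hours, minutes, seconds)
```

Because the prefix is consumed rather than looked behind, the time has to be read from the groups instead of from the whole match.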

If it is always at the end of the line, use:
$t = "2013-07-04 02:42:35 Alert 172.172.19.9 Jul 4 01:42:35"
[regex]::match( $t, "(\d+:){2}(\d+)$" ) | select -expa value
Edit after comment:
try this:
$time_regex = "(?<= \d+ )(\d+:){2}\d+"

RegEx to match specific sentence plus date and time

I've tried figuring out how to make a regex match something specific followed by a date and a time. I cannot for the life of me figure it out!
I want to match the following sentence, where the date and time of course may be random:
Den 25/01/2013 kl. 14.03 skrev
So it should match like this: Den dd/mm/yyyy kl. hh.mm skrev
Note that time is in 24-hour format.
Can anyone help here? I can easily find an example that matches a date or time, but I don't know how to combine it with this specific sentence :(
Thanks in advance

Just combine them:
Den (0[1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/(0{3}[1-9]|((?!0{3}\d)\d{4})) kl\. ([01][0-9]|2[0-3])\.([0-5][0-9]) skrev
Note: the date is not fully validated. This will match 30/02/2000.
Den matches Den as such.
(0[1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/ matches the day and month. (0{3}[1-9]|((?!0{3}\d)\d{4})) matches the year while avoiding 0000.
kl\. matches kl. The \ before the . escapes the ., which is a special character in regex.
([01][0-9]|2[0-3])\.([0-5][0-9]) matches a time from 00.00 to 23.59.
skrev matches skrev as such.
The following validates the date a bit better:
Den ((0[1-9]|[12][0-9]|3[01])/(?=(0[13578]|1[02]))(0[13578]|1[02])|(0[1-9]|[12][0-9]|30)/(?=(0[469]|11))(0[469]|11)|(0[1-9]|[12][0-9])/(?=(02))(02))/(0{3}[1-9]|((?!0{3}\d)\d{4})) kl\. ([01][0-9]|2[0-3])\.([0-5][0-9]) skrev
It still matches 29/02/1999: there is no leap-year validation.
To match single-digit days and months as well, replace the date part with the following:
(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[0-2])/(0{3}[1-9]|((?!0{3}\d)\d{4}))
The ? makes the preceding part optional, i.e. the 0 becomes optional.
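As a quick sanity check, the first pattern can be exercised from Python. This is a minimal sketch, assuming the hour alternative is written as 2[0-3]; REPLY_HEADER and parse_header are hypothetical names chosen for the example.

```python
import re

# First pattern from this answer, with the hour alternative as 2[0-3].
REPLY_HEADER = re.compile(
    r"Den (0[1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/"
    r"(0{3}[1-9]|((?!0{3}\d)\d{4})) "
    r"kl\. ([01][0-9]|2[0-3])\.([0-5][0-9]) skrev"
)


def parse_header(text):
    """Return (day, month, year, hour, minute) as strings, or None."""
    m = REPLY_HEADER.search(text)
    if m is None:
        return None
    # Group 4 is the inner year alternative and duplicates group 3.
    day, month, year, _inner_year, hour, minute = m.groups()
    return day, month, year, hour, minute


print(parse_header("Den 25/01/2013 kl. 14.03 skrev"))
```

Out-of-range values such as hour 24 or year 0000 are rejected, while an impossible date like 30/02/2000 still passes, as noted above.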

`Den ([0-3]\d)/([0-1]\d)/(\d{4}) kl\. ([0-2]\d)\.([0-5]\d) skrev`
This captures the individual values, which makes it easier to validate them afterwards.

Maybe not the smartest solution, but this expression should fit your request:
Den\s[0-3][0-9]/[0-1][0-9]/[0-9][0-9][0-9][0-9]\skl\.\s[0-2][0-9]\.[0-5][0-9]\sskrev

Turns out all I actually needed was:
(Den ../../.... kl. ..... skrev)
Since . matches any character, and because this sentence is auto-generated by an e-mail client, there is no need to validate that it is actually a date; it is enough to look for this pattern and discard everything after it. Nobody would ever write that so specifically in the middle of regular text.
In case anybody is wondering, this is for SpiceWorks reply header filtering.
