Despite X-Spam-Status scores above "required", spam isn't sorted out of mbox - procmail

I'm trying to run my mbox file through spamassassin with:
formail -s procmail ~/procmail.rc < mbox
Despite what I think looks like a proper procmail rc file & an ok spamassassin local.cf the mail that gets scored higher than my 'required' is not being filtered into my probably-spam folder.
Any spamassassin experts who can help?
This is on Ubuntu 16.04LTS
From my email header:
X-Spam-Status: No, score=-5.0 required=3.0 tests=RCVD_IN_DNSWL_HI,SPF_PASS,
T_RP_MATCHES_RCVD autolearn=unavailable autolearn_force=no version=3.4.1
My spamassassin local.cf:
rewrite_header Subject *****SPAM*****
report_safe 0
required_score 3.0
use_bayes 1
bayes_auto_learn 1
normalize_charset 1
ifplugin Mail::SpamAssassin::Plugin::Shortcircuit
shortcircuit BAYES_99 spam
shortcircuit BAYES_00 ham
endif # Mail::SpamAssassin::Plugin::Shortcircuit
My procmailrc:
:0fw: spamassassin.lock
* < 256000
| spamassassin
:0:
* ^X-Spam-Level: \*\*\*
almost-certainly-spam
:0:
* ^X-Spam-Status: Yes
probably-spam
# Work around procmail bug: any output on stderr will cause the "F" in "From"
# to be dropped. This will re-add it.
:0
* ^^rom[ ]
{
LOG="*** Dropped F off From_ header! Fixing up. "
:0 fhw
| sed -e '1s/^/F/'
}

I cannot see any obvious mistakes in your spamassassin local.cf or your procmailrc. But the example email header clearly says that the message is not spam. The score is not higher than the threshold: it is -5.0, which is lower than the required 3.0. Hence it says "X-Spam-Status: No".
Note that the procmail workaround should not be needed anymore. However, you might want to use -f -, i.e.
formail -s procmail -f - ~/procmail.rc < mbox
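To double-check a header by hand, the comparison can be sketched in a few lines of Python. This is only an illustration of reading the score= and required= fields from an X-Spam-Status value; the function name is_spam is made up here, and it assumes a message counts as spam when its score reaches the required value.

```python
import re

NUMBER = r"(-?\d+(?:\.\d+)?)"

def is_spam(status_header: str) -> bool:
    """Compare score= against required= in an X-Spam-Status value."""
    score = float(re.search(r"score=" + NUMBER, status_header).group(1))
    required = float(re.search(r"required=" + NUMBER, status_header).group(1))
    # Assumption: spam means the score is at least the required threshold.
    return score >= required

# Header value taken from the question
header = "No, score=-5.0 required=3.0 tests=RCVD_IN_DNSWL_HI,SPF_PASS"
print(is_spam(header))  # False: -5.0 is below 3.0
```

With the header from the question this reports False, which matches the "No" that SpamAssassin wrote and explains why the probably-spam recipe never fires for that message.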


How to write unix regular expression to select for specific files in a cp for-loop

I've got a directory with a bunch of files. Instead of describing the filenames and extensions, I'll just show you what is in the directory:
P01_1.atag P03_3.tgt P05_6.src P08_3.atag P10_5.tgt
P01_1.src P03_4.atag P05_6.tgt P08_3.src P10_6.atag
P01_1.tgt P03_4.src P06_1.atag P08_3.tgt P10_6.src
P01_2.atag P03_4.tgt P06_1.src P08_4.atag P10_6.tgt
P01_2.src P03_5.atag P06_1.tgt P08_4.src P11_1.atag
P01_2.tgt P03_5.src P06_2.atag P08_4.tgt P11_1.src
P01_3.atag P03_5.tgt P06_2.src P08_5.atag P11_1.tgt
P01_3.src P03_6.atag P06_2.tgt P08_5.src P11_2.atag
P01_3.tgt P03_6.src P06_3.atag P08_5.tgt P11_2.src
P01_4.atag P03_6.tgt P06_3.src P08_6.atag P11_2.tgt
P01_4.src P04_1.atag P06_3.tgt P08_6.src P11_3.atag
P01_4.tgt P04_1.src P06_4.atag P08_6.tgt P11_3.src
P01_5.atag P04_1.tgt P06_4.src P09_1.atag P11_3.tgt
P01_5.src P04_2.atag P06_4.tgt P09_1.src P11_4.atag
P01_5.tgt P04_2.src P06_5.atag P09_1.tgt P11_4.src
P01_6.atag P04_2.tgt P06_5.src P09_2.atag P11_4.tgt
P01_6.src P04_3.atag P06_5.tgt P09_2.src P11_5.atag
P01_6.tgt P04_3.src P06_6.atag P09_2.tgt P11_5.src
P02_1.atag P04_3.tgt P06_6.src P09_3.atag P11_5.tgt
P02_1.src P04_4.atag P06_6.tgt P09_3.src P11_6.atag
P02_1.tgt P04_4.src P07_1.atag P09_3.tgt P11_6.src
P02_2.atag P04_4.tgt P07_1.src P09_4.atag P11_6.tgt
P02_2.src P04_5.atag P07_1.tgt P09_4.src P12_1.atag
P02_2.tgt P04_5.src P07_2.atag P09_4.tgt P12_1.src
P02_3.atag P04_5.tgt P07_2.src P09_5.atag P12_1.tgt
P02_3.src P04_6.atag P07_2.tgt P09_5.src P12_2.atag
P02_3.tgt P04_6.src P07_3.atag P09_5.tgt P12_2.src
P02_4.atag P04_6.tgt P07_3.src P09_6.atag P12_2.tgt
P02_4.src P05_1.atag P07_3.tgt P09_6.src P12_3.atag
P02_4.tgt P05_1.src P07_4.atag P09_6.tgt P12_3.src
P02_5.atag P05_1.tgt P07_4.src P10_1.atag P12_3.tgt
P02_5.src P05_2.atag P07_4.tgt P10_1.src P12_4.atag
P02_5.tgt P05_2.src P07_5.atag P10_1.tgt P12_4.src
P02_6.atag P05_2.tgt P07_5.src P10_2.atag P12_4.tgt
P02_6.src P05_3.atag P07_5.tgt P10_2.src P12_5.atag
P02_6.tgt P05_3.src P07_6.atag P10_2.tgt P12_5.src
P03_1.atag P05_3.tgt P07_6.src P10_3.atag P12_5.tgt
P03_1.src P05_4.atag P07_6.tgt P10_3.src P12_6.atag
P03_1.tgt P05_4.src P08_1.atag P10_3.tgt P12_6.src
P03_2.atag P05_4.tgt P08_1.src P10_4.atag P12_6.tgt
P03_2.src P05_5.atag P08_1.tgt P10_4.src
P03_2.tgt P05_5.src P08_2.atag P10_4.tgt
P03_3.atag P05_5.tgt P08_2.src P10_5.atag
P03_3.src P05_6.atag P08_2.tgt P10_5.src
I have a file that is just outside of this directory that I need to copy to all of the files that end with "_1.src" inside the directory.
I'm working with unix in the Terminal app, so I tried writing this for loop, but it rejected my regular expression:
for .*1.src in ./
> do
> cp ../1.src
> done
I've only written regular expressions in Python before and have minimal experience, but I was under the impression that .* would match any combination of characters. However, I got the following error message:
-bash: `.*1.src': not a valid identifier
I then tried the same for loop with the following regular expression:
^[a-zA-Z0-9_]*1.src$
But I got the same error message:
-bash: `^[a-zA-Z0-9_]*1.src$': not a valid identifier
I tried the same regular expression with and without quotation marks, but it always gives the same 'not a valid identifier' error message.
Tested on Bash 4.4.12, the following is possible:
$ for i in ./*_1.src; do echo "$i" ; done
This will echo every file ending with _1.src to the screen, thus moving it will be possible as well.
$ mkdir tmp
$ for i in ./*_1.src; do mv "$i" tmp/.; done
I've tested with the following data:
$ touch P{1,2}{0,1,2}_{0..6}.{src,tgt,atag}
$ ls
P10_0.atag P10_5.src P11_3.tgt P12_2.atag P20_0.src P20_5.tgt P21_4.atag P22_2.src
P10_0.src P10_5.tgt P11_4.atag P12_2.src P20_0.tgt P20_6.atag P21_4.src P22_2.tgt
P10_0.tgt P10_6.atag P11_4.src P12_2.tgt P20_1.atag P20_6.src P21_4.tgt P22_3.atag
P10_1.atag P10_6.src P11_4.tgt P12_3.atag P20_1.src P20_6.tgt P21_5.atag P22_3.src
P10_1.src P10_6.tgt P11_5.atag P12_3.src P20_1.tgt P21_0.atag P21_5.src P22_3.tgt
P10_1.tgt P11_0.atag P11_5.src P12_3.tgt P20_2.atag P21_0.src P21_5.tgt P22_4.atag
P10_2.atag P11_0.src P11_5.tgt P12_4.atag P20_2.src P21_0.tgt P21_6.atag P22_4.src
P10_2.src P11_0.tgt P11_6.atag P12_4.src P20_2.tgt P21_1.atag P21_6.src P22_4.tgt
P10_2.tgt P11_1.atag P11_6.src P12_4.tgt P20_3.atag P21_1.src P21_6.tgt P22_5.atag
P10_3.atag P11_1.src P11_6.tgt P12_5.atag P20_3.src P21_1.tgt P22_0.atag P22_5.src
P10_3.src P11_1.tgt P12_0.atag P12_5.src P20_3.tgt P21_2.atag P22_0.src P22_5.tgt
P10_3.tgt P11_2.atag P12_0.src P12_5.tgt P20_4.atag P21_2.src P22_0.tgt P22_6.atag
P10_4.atag P11_2.src P12_0.tgt P12_6.atag P20_4.src P21_2.tgt P22_1.atag P22_6.src
P10_4.src P11_2.tgt P12_1.atag P12_6.src P20_4.tgt P21_3.atag P22_1.src P22_6.tgt
P10_4.tgt P11_3.atag P12_1.src P12_6.tgt P20_5.atag P21_3.src P22_1.tgt P10_5.atag
P11_3.src P12_1.tgt P20_0.atag P20_5.src P21_3.tgt P22_2.atag
Apparently, my previous answer didn't work. But this seems to:
$ for x in `echo ./P[01][012]_1.src`; do echo "$x"; done
./P01_1.src
./P02_1.src
So, when you run this echo alone, this pattern gets expanded into many names:
$ echo ./P[01][012]_1.src # note that the 'regex' is not enclosed in quotes
./P01_1.src ./P02_1.src
And then you can iterate over these names in a loop.
BTW, as noted in the comments, you don't even need that echo, so you can plug the pattern right into the loop:
for x in ./P[01][012]_1.src; do echo "$x"; done
Please correct me if your goal is something other than
"overwrite many existing files sharing a common suffix with the contents of a single file"
find /path/to/dest_dir -type f -name "*_1.src" |xargs -n1 cp /path/to/source_file
Note that without the -maxdepth 1 option, find will recurse through your destination directory.
Thanks to everyone; this is what ended up working:
for x in `echo ./P[0-9]*_1.src`
> do
> cp ../1.src "$x"
> done
This loop allowed me to copy the contents of the one file to all of the files in the subdirectory that ended with "_1.src"
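Since the question mentions Python experience, here is a rough equivalent of that loop as a sketch. It uses a glob pattern, not a regular expression, just like the shell loop does; the function name overwrite_matching and its parameters are invented for illustration.

```python
import shutil
from pathlib import Path

def overwrite_matching(source, dest_dir, pattern="*_1.src"):
    """Overwrite every file in dest_dir matching the glob with source's contents."""
    source = Path(source)
    targets = sorted(p for p in Path(dest_dir).glob(pattern) if p.is_file())
    for target in targets:
        # copyfile replaces the contents of an existing file
        shutil.copyfile(source, target)
    return targets

# Hypothetical usage, run from inside the directory:
# overwrite_matching("../1.src", ".")
```

Path.glob does not descend into subdirectories for a pattern like this, so it behaves like the plain shell glob rather than like find without -maxdepth 1.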

bash script - fetch only unique domains from email list to variable

I am new to bash and am having trouble understanding how to get this done.
I want to check the domains of all the email addresses in the "To:" field and collect the unique domains in a variable, so that I can compare them to the From domain.
I get the "from address" domain by using
grep -m 1 "From: " filename | cut -f 2 -d '#' | cut -d ">" -f 1
when reading a mail stored in file filename.
For "to address" domain there can be multiple To: addresses and having multiple domains. I am not sure how to get unique domains from "to address field".
Example to address line will be like this:
To: user#domain.com, user2#domain.com,
User Name <sample#domaintest.com>, test#domainname.com
grep -m 1 "^To: " filename | cut -f 2 -d '#' | cut -d ">" -f 1
but email addresses come in different formats, so I am not sure whether grep is the right tool or whether I should look at awk or something else.
I need to get the unique domain list from the "To:" field email address/addresses to a variable in bash script.
Desired output for above example:
domain.com,domaintest.com,domainname.com
If you are hellbent on doing this with line-oriented utilities, there is a utility formail in the Procmail distribution which can normalize things for you somewhat.
bash$ formail -czxTo: <<\==test==
> From: me <sender#example.com>
> To: you <first#example.org>,
> them <other#example.net>
> Subject: quick demo
>
> Very quick, innit.
> ==test==
first#example.org, other#example.net
So with that you have input which you can actually pass to grep or Awk ... or sed.
todoms=$(formail -czxTo: <message | tr ',' '\n' | sed 's/.*#//')
The From: address will not be normalized by formail -czxFrom: but you can use a neat trick: make formail generate a reply back to the From: address, and then extract the To: header from that.
fromdom=$(formail -rtzcxTo: <message | sed 's/.*#//')
In some more detail, -r says to create a new reply to whoever sent you message, and then we do -zcxTo: on that.
(The -t option may or may not do what you want. In this case, I would perhaps omit it. http://www.iki.fi/era/procmail/formail.html has (vague) documentation for what it does; see also the section just before http://www.iki.fi/era/procmail/mini-faq.html#group-writable and sorry for the clumsy link -- there doesn't seem to be a good page-internal anchor to link to.)
Email address normalization is tricky because there are so many variants to choose from.
From: Elvis Parsley <king#graceland.example.com>
From: king#graceland.example.com
From: "Parsley, Elvis" <king#graceland.example.com> (kill me, I have to use Outlook)
From: "quoted#string" <king#graceland.example.com> (wait, he is already dead)
To: This could fold <recipient#example.net>,
over multiple lines <another#example.org>
I would turn to a more capable language with proper support for parsing all of these formats. My choice would be Python, though you could probably also pull this off in a few lines of Ruby or Perl.
The email library was revamped in Python 3.6, so this assumes you have at least that version. The email.headerregistry module, which became a stable part of the library in 3.6, is particularly convenient here.
#!/usr/bin/env python3
from email.policy import default
from email import message_from_binary_file
import sys
if len(sys.argv) == 1:
    sys.argv.append('-')
for arg in sys.argv[1:]:
    if arg == '-':
        # message_from_binary_file needs bytes, so read the binary stdin buffer
        handle = sys.stdin.buffer
    else:
        handle = open(arg, 'rb')
    message = message_from_binary_file(handle, policy=default)
    # From: is parsed as an address list; use its first address
    from_dom = message['From'].addresses[0].domain
    to_doms = set()
    for addr in message['To'].addresses:
        dom = addr.domain
        if dom == from_dom:
            continue
        to_doms.add(dom)
    print(','.join([from_dom] + list(to_doms)))
    if arg != '-':
        handle.close()
This simply produces a comma-separated list of domain names; you might want to do the rest of the processing in Python too instead, or change this so that it prints something in a slightly different format.
You'd save this in a convenient place (say, /usr/local/bin/fromto) and mark it as executable (chmod 755 /usr/local/bin/fromto). Now you can call this from the shell like any other utility like grep.
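If all you need is the unique To: domains, a smaller sketch with email.utils.getaddresses from the standard library may be enough. The function name unique_domains is made up for this example, and it assumes ordinary addresses with an @ sign and a header value that has already been pulled out of the message.

```python
from email.utils import getaddresses

def unique_domains(header_values):
    """Return the domains found in the given header values, first-seen order."""
    domains = []
    for _name, addr in getaddresses(header_values):
        if "@" not in addr:
            continue  # skip anything that did not parse as an address
        domain = addr.rpartition("@")[2].lower()
        if domain not in domains:
            domains.append(domain)
    return domains

# A folded To: value with plain and "Name <addr>" forms mixed together
to_value = ("user@domain.com, user2@domain.com,\n"
            " User Name <sample@domaintest.com>, test@domainname.com")
print(",".join(unique_domains([to_value])))
```

A list is used instead of a set so the domains come out in the order they first appear, which makes the output easier to compare by eye.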

fabric: why can't I get local("history") to print out anything?

Here's my fabfile
from fabric.api import local, task
@task
def tracking(suffix=""):
    buffer_ = "*" * 40
    print (buffer_)
    local("whoami")
    print (buffer_)
    local("env | grep dn")
    #this one comes out empty...
    print (buffer_)
    out = local("history")
    print (buffer_)
Everything prints out as expected, except for the history:
****************************************
[localhost] local: whoami
jluc
****************************************
[localhost] local: env | grep dn
dn_cb=/Users/jluc/.berkshelf/cookbooks
dn_cc=/Users/jluc/kds2/chef/chef-repo/cookbooks
dn_khtmldump=/Users/jluc/kds2/out/tests/dump2static
dn_cv=/Users/jluc/kds2/chef/vagrant/ubuntu2
****************************************
[localhost] local: history
****************************************
But nothing wrong with history on the command line...
history | tail -5
613 history
614 fab -f fabfile2.py tracking
615 history | tail -5
616 cls
617 history | tail -5
What gives? Adding shell="/bin/bash" didn't help either.
MacOs Sierra
According to the docs:
local is not currently capable of simultaneously printing and capturing output, as run/sudo do. The capture kwarg allows you to switch between printing and capturing as necessary, and defaults to False.
I'd interpret this as meaning if you want the history command to work, you need to capture the output first. Try changing all your local commands to include both shell="/bin/bash", and capture=True
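The print-or-capture distinction the docs describe can be illustrated with plain subprocess. This is a stand-in written for this answer, not Fabric's implementation; local_sketch is a made-up name, and it only shows that captured output is returned to the caller rather than shown on the terminal.

```python
import subprocess

def local_sketch(command, capture=False):
    """Run a shell command; either let it print, or capture and return stdout."""
    result = subprocess.run(
        command,
        shell=True,
        text=True,
        # With capture=False the child writes straight to the terminal.
        stdout=subprocess.PIPE if capture else None,
    )
    return result.stdout.strip() if capture else ""

out = local_sketch("echo hello", capture=True)
print(out)  # captured output has to be printed explicitly
```

In the fabfile, the same idea would mean printing the value returned by local(..., capture=True) yourself instead of expecting it to appear on its own.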

Procmail: Move to folder and mark as read

a simple question:
I want to move emails with a certain subject to a folder and mark them as read afterwards. Moving works for me with
:0: H
* ^Subject:.*(ThisIsMySubject)
$HOME/mail/ThisIsMyFolder
But how to mark the mails as read?
Note: Updated dec. 16th 2011
Procmail solution
The following recipe works for me. .Junk is the spam folder:
MAILDIR=$HOME/Maildir
:0
* ^X-Spam-Flag: YES
{
# First deliver to maildir so LASTFOLDER gets set
:0 c
.Junk/
# Manipulate the filename
:0 ai
* LASTFOLDER ?? ()\/[^/]+^^
|mv "$LASTFOLDER" "$MAILDIR/.Junk/cur/$MATCH:2,S"
}
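What the mv in that recipe does can be sketched in Python: a Maildir message counts as read once it sits in cur/ with the S (seen) flag after ":2,". The function name mark_seen is invented here, and the sketch assumes the message was just delivered to new/ and has no other flags worth keeping.

```python
import os

def mark_seen(message_path):
    """Move a Maildir message from new/ to cur/ and give it the Seen flag."""
    new_dir, filename = os.path.split(message_path)
    maildir = os.path.dirname(new_dir)
    # Drop any existing info suffix, then append the Seen flag.
    base = filename.split(":2,")[0]
    target = os.path.join(maildir, "cur", base + ":2,S")
    os.rename(message_path, target)
    return target

# Hypothetical usage:
# mark_seen("/home/user/Maildir/.Junk/new/1323993600.M1P2.host")
```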
Maildrop solution
Preface: Recently I had (no, I wanted) to do the same thing with a maildropfilter. After reading man maildropfilter I concocted the following recipe. I'm sure people will find this handy - I know I do.
The example below marks new emails as read, and also marks any older messages in the folder that are still unread.
SPAMDIRFULL="$DEFAULT/.Junk"
if ( /^X-Spam-Flag: YES$/ || \
/^X-Spam-Level: \*\*\*/ || \
/^Subject: \*+SPAM\*/ )
{
exception {
cc "$SPAMDIRFULL"
`for x in ${SPAMDIRFULL}/new/*; do [ -f $x ] && mv $x ${SPAMDIRFULL}/cur/${x##*/}:2,S; done`
`for x in ${SPAMDIRFULL}/cur/*:2,; do [ -f $x ] && mv $x ${SPAMDIRFULL}/cur/${x##*/}S; done`
to "/dev/null"
}
}
Note that the exception command may seem counterintuitive. The manual states the following:
The exception statement traps errors that would normally cause
maildrop to terminate. If a fatal error is encountered anywhere within
the block of statements enclosed by the exception clause, execution
will resume immediately following the exception clause.

How can I send an automated reply to the sender and all recipients with Procmail?

I'd like to create a procmail recipe or Perl or shell script that will send an auto response to the original sender as well as anybody that was copied (either To: or cc:) on the original email.
Example:
bob#example.com writes an email to john#example.com and paul#example.com (in the To: field). Copies are sent via cc: to rob#example.com and alice#example.com.
I'd like the script to send an auto response to the original sender (bob#example.com) and everybody else that was sent a copy of the email (john#example.com, paul#example.com, rob#example.com and alice#example.com).
Thanks
You should be able to accomplish this using this procmail module for Perl 5. You could also just use the procmail configuration files to do this.
Here's an example of our procmail configuration sending e-mails "through" a perl script.
:0fw
* < 500000
| /etc/smrsh/decode_subject.pl
I hope that helps get ya started.
FROM=`formail -rtzxTo:`
CC=`formail -zxTo: -zxCc: | tr '\n' ,`
:0c
| ( echo To: "$FROM"; echo Cc: "$CC"; echo Subject: auto-reply; \
echo; echo Please ignore. ) \
| $SENDMAIL -oi -t
A well-formed auto-reply should set some additional headers etc; but this should hopefully be enough to get you started. See also http://porkmail.org/era/mail/autoresponder-faq.html
Depending on your flavor of tr, you might need to encode the newline differently; not all implementations of tr understand the '\n' format. Try '\012' or a literal newline in single quotes if you cannot get this to work.
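If you would rather do the address collection in a script, the same split into sender and other recipients can be sketched with Python's standard email package. The function name reply_recipients is made up for this example; it only gathers addresses and does not send anything or add the extra headers a well-formed auto-reply needs.

```python
from email import message_from_string
from email.utils import getaddresses

def reply_recipients(raw_message):
    """Return (sender addresses, To and Cc addresses) for a raw message."""
    msg = message_from_string(raw_message)
    sender = [addr for _name, addr in getaddresses(msg.get_all("From", []))]
    copied = msg.get_all("To", []) + msg.get_all("Cc", [])
    others = [addr for _name, addr in getaddresses(copied)]
    return sender, others

# Hypothetical usage with a message read from standard input:
# import sys
# to_list, cc_list = reply_recipients(sys.stdin.read())
```

The first list would go into the To: header of the reply and the second into Cc:, mirroring the FROM and CC variables in the recipe.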