python imaplib - error 'unexpected response' - python-2.7

Hi, I've suddenly encountered an error using imaplib in some code that worked fine before.
import imaplib
m = imaplib.IMAP4('myserver','port')
m.login(r'username','password')
m.select()
gives me the error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/imaplib.py", line 649, in select
typ, dat = self._simple_command(name, mailbox)
File "/usr/lib/python2.7/imaplib.py", line 1070, in _simple_command
return self._command_complete(name, self._command(name, *args))
File "/usr/lib/python2.7/imaplib.py", line 899, in _command_complete
raise self.abort('command: %s => %s' % (name, val))
imaplib.abort: command: SELECT => unexpected response: '* 1520 EXISTS'
I'm not sure what it means. Emails are otherwise coming through fine, and I'm using DavMail as the server.
The program in its entirety saves attachments with a certain name in a specific folder.
I've stepped through it, and it's definitely the m.select() call where it's falling over.
This same program worked absolutely fine until recently.
What am I doing wrong, and how do I fix it?
The log of activity is as follows
>>> import imaplib
>>> m = imaplib.IMAP4('server','port')
>>> Debug=4
>>> m.debug
0
>>> m.debug=4
>>> m.debug
4
>>> m.login(r'username','password')
01:26.55 > HLFI1 LOGIN "username" "password"
01:30.76 < HLFI1 OK Authenticated
('OK', ['Authenticated'])
>>> m.list()
01:56.33 > HLFI2 LIST "" *
02:00.04 < * LIST (\HasNoChildren) "/" "Trash/Sent Messages"
02:00.04 < * LIST (\HasNoChildren) "/" "Sync Issues/Server Failures"
02:00.04 < * LIST (\HasNoChildren) "/" "Sync Issues/Local Failures"
02:00.04 < * LIST (\HasNoChildren) "/" "Sync Issues/Conflicts"
02:00.04 < * LIST (\HasChildren) "/" "Sync Issues"
02:00.04 < * LIST (\HasNoChildren) "/" "Junk E-mail"
02:00.04 < * LIST (\HasNoChildren) "/" "Drafts"
02:00.04 < * LIST (\HasChildren) "/" "Trash"
02:00.04 < * LIST (\HasNoChildren) "/" "Sent"
02:00.04 < * LIST (\HasNoChildren) "/" "Outbox"
02:00.04 < * LIST (\HasNoChildren) "/" "INBOX"
02:00.04 < HLFI2 OK LIST completed
('OK', ['(\\HasNoChildren) "/" "Trash/Sent Messages"', '(\\HasNoChildren) "/" "Sync Issues/Server Failures"', '(\\HasNoChildren) "/" "Sync Issues/Local Failures"', '(\\HasNoChildren) "/" "Sync Issues/Conflicts"', '(\\HasChildren) "/" "Sync Issues"', '(\\HasNoChildren) "/" "Junk E-mail"', '(\\HasNoChildren) "/" "Drafts"', '(\\HasChildren) "/" "Trash"', '(\\HasNoChildren) "/" "Sent"', '(\\HasNoChildren) "/" "Outbox"', '(\\HasNoChildren) "/" "INBOX"'])
>>> m.select()
02:21.37 > HLFI3 SELECT INBOX
02:30.87 < * 1548 EXISTS
02:30.87 last 4 IMAP4 interactions:
00:16.73 < * OK [CAPABILITY IMAP4REV1 AUTH=LOGIN MOVE] IMAP4rev1 DavMail 4.3.0-2125 server ready
00:16.73 > HLFI0 CAPABILITY
00:16.74 < * CAPABILITY IMAP4REV1 AUTH=LOGIN MOVE
00:16.77 < HLFI0 OK CAPABILITY completed
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/imaplib.py", line 649, in select
typ, dat = self._simple_command(name, mailbox)
File "/usr/lib/python2.7/imaplib.py", line 1070, in _simple_command
return self._command_complete(name, self._command(name, *args))
File "/usr/lib/python2.7/imaplib.py", line 899, in _command_complete
raise self.abort('command: %s => %s' % (name, val))
imaplib.abort: command: SELECT => unexpected response: '* 1548 EXISTS'
** UPDATE **
I've now filed a bug at python-dev (Bug report on Python), where David Murray says the response is not RFC compliant, and a second one on the DavMail SourceForge tracker (davmail bug report), where M. Guessant says it is necessary for the IMAP keep-alive.
I'll keep this updated with developments.

It appears that the space-padded message count in the EXISTS response is what triggers this. It is unclear to me whether it should be classified as a bug in Python's imaplib or in the IMAP server you are using; I would argue that imaplib should be robust against this, regardless of what the spec says. Perhaps you should file a bug report?
(If you do, please take care to add details about which server is producing this response. If it is a commercial product with a respectable market share, it is important to fix; if it's your own simple Python server, of course, they might not care.)

I found that the EXISTS response is padded by DavMail; it seems to happen when the number of emails is around or over 500.
Acceptable response:
58:24.54 > PJJD3 EXAMINE INBOX
58:24.77 < * 486 EXISTS
58:24.78 matched r'\* (?P<data>\d+) (?P<type>[A-Z-]+)( (?P<data2>.*))?' => ('486', 'EXISTS', None, None)
58:24.78 untagged_responses[EXISTS] 0 += ["486"]
Failed response:
57:50.86 > KPFE3 EXAMINE INBOX
57:51.10 < * 953 EXISTS
57:51.10 last 0 IMAP4 interactions:
57:51.10 > KPFE4 LOGOUT
I have created a pull request for the imaplib library on GitHub to account for the issue.
To patch this in your own code, imaplib allows the regex to be altered; simply add the following:
imaplib.Untagged_status = imaplib.re.compile(br'\*[ ]{1,2}(?P<data>\d+) (?P<type>[A-Z-]+)( (?P<data2>.*))?')
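For example, a minimal sketch of applying this workaround before selecting the mailbox (the host, port, and credentials below are placeholders):
import imaplib
# Loosen the untagged-response regex so one or two spaces after '*' are accepted,
# since DavMail pads the message count as an IMAP keep-alive.
imaplib.Untagged_status = imaplib.re.compile(br'\*[ ]{1,2}(?P<data>\d+) (?P<type>[A-Z-]+)( (?P<data2>.*))?')
m = imaplib.IMAP4('myserver', 1143)  # placeholder DavMail host/port
m.login('username', 'password')
m.select('INBOX')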

Related

ExportDicomData request of Google Cloud Healthcare API on GitHub tutorials never finishes

I'm trying the AutoML Vision ML Codelab from the Cloud Healthcare API GitHub tutorials.
https://github.com/GoogleCloudPlatform/healthcare/blob/master/imaging/ml_codelab/breast_density_auto_ml.ipynb
I ran the Export DICOM data cell of the Convert DICOM to JPEG section, and the request, as well as all the prerequisite cells, succeeded.
But waiting for operation completion times out and the operation never finishes.
(The ExportDicomData request status on the Dataset page stays "Running" for over a day. I tried many times, but all the requests piled up, stuck at "Running". A few times I tried from scratch and the results were the same.)
What I have done so far:
1) Removed "output_config", since an INVALID ARGUMENT error occurs otherwise.
https://github.com/GoogleCloudPlatform/healthcare/issues/133
2) Enabled the Cloud Resource Manager API, since it is needed.
This is the cell code.
# Path to export DICOM data.
dicom_store_url = os.path.join(HEALTHCARE_API_URL, 'projects', project_id, 'locations', location, 'datasets', dataset_id, 'dicomStores', dicom_store_id)
path = dicom_store_url + ":export"
# Headers (send request in JSON format).
headers = {'Content-Type': 'application/json'}
# Body (encoded in JSON format).
# output_config = {'output_config': {'gcs_destination': {'uri_prefix': jpeg_folder, 'mime_type': 'image/jpeg; transfer-syntax=1.2.840.10008.1.2.4.50'}}}
output_config = {'gcs_destination': {'uri_prefix': jpeg_folder, 'mime_type': 'image/jpeg; transfer-syntax=1.2.840.10008.1.2.4.50'}}
body = json.dumps(output_config)
resp, content = http.request(path, method='POST', headers=headers, body=body)
assert resp.status == 200, 'error exporting to JPEG, code: {0}, response: {1}'.format(resp.status, content)
print('Full response:\n{0}'.format(content))
# Record operation_name so we can poll for it later.
response = json.loads(content)
operation_name = response['name']
This is the result of waiting.
Waiting for operation completion...
Full response:
{
"name": "projects/my-datalab-tutorials/locations/us-central1/datasets/sample-dataset/operations/18300485449992372225",
"metadata": {
"#type": "type.googleapis.com/google.cloud.healthcare.v1beta1.OperationMetadata",
"apiMethodName": "google.cloud.healthcare.v1beta1.dicom.DicomService.ExportDicomData",
"createTime": "2019-08-18T10:37:49.809136Z"
}
}
AssertionError                            Traceback (most recent call last)
<ipython-input-18-1a57fd38ea96> in <module>()
21 timeout = time.time() + 10*60 # Wait up to 10 minutes.
22 path = os.path.join(HEALTHCARE_API_URL, operation_name)
---> 23 _ = wait_for_operation_completion(path, timeout)
<ipython-input-18-1a57fd38ea96> in wait_for_operation_completion(path, timeout)
15
16 print('Full response:\n{0}'.format(content))
---> 17 assert success, "operation did not complete successfully in time limit"
18 print('Success!')
19 return response
AssertionError: operation did not complete successfully in time limit
API Version is v1beta1.
I was wondering if somebody has any suggestions.
Thank you.
After several more attempts and leaving it running overnight, it finally succeeded. I don't know why.
There was a recent update to the codelabs. The error message is due to the timeout in the codelab and not the actual operation. This has been addressed in the update. Please let me know if you are still running into any issues!
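If you would rather keep polling than re-run the cell, a rough sketch of waiting without the notebook's 10-minute cutoff might look like the following. It assumes it can reuse http, HEALTHCARE_API_URL, and operation_name from the cells above, and relies on the standard done field of a long-running operation:
import json
import os
import time

path = os.path.join(HEALTHCARE_API_URL, operation_name)
while True:
    resp, content = http.request(path, method='GET')
    assert resp.status == 200, 'error polling operation, code: {0}, response: {1}'.format(resp.status, content)
    operation = json.loads(content)
    if operation.get('done'):
        print('Operation finished:\n{0}'.format(content))
        break
    time.sleep(30)  # exports of large DICOM stores can take a while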

EndpointConnectionError: Could not connect to the endpoint URL: "http://169.254.169.254/....."

I am trying to create an AWS RDS instance and deploy a Lambda function using a Python script. However, I am getting the error below; it looks like the script is unable to communicate with AWS to create the RDS instance.
DEBUG: Caught retryable HTTP exception while making metadata service request to http://169.254.169.254/latest/meta-data/iam/security-credentials/: Could not connect to the endpoint URL: "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/botocore/utils.py", line 303, in _get_request
response = self._session.send(request.prepare())
File "/usr/lib/python2.7/site-packages/botocore/httpsession.py", line 282, in send raise EndpointConnectionError(endpoint_url=request.url, error=e)
EndpointConnectionError: Could not connect to the endpoint URL: "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
I am getting the AWS credentials through Okta SSO. In the ~/.aws directory, below are the contents of the 'credentials' and 'config' files respectively.
[default]
aws_access_key_id = <Key Id>
aws_secret_access_key = <Secret Key>
aws_session_token = <Token>
[default]
region = us-west-2
for az in availability_zones:
    if aurora.get_db_instance(db_instance_identifier + "-" + az)[0] != 0:
        aurora.create_db_instance(db_cluster_identifier, db_instance_identifier + "-" + az, az, subnet_group_identifier, db_instance_type)
    else:
        aurora.modify_db_instance(db_cluster_identifier, db_instance_identifier + "-" + az, az, db_instance_type)
# Wait for DB to become available for connection
iter_max = 15
iteration = 0
for az in availability_zones:
    while aurora.get_db_instance(db_instance_identifier + "-" + az)[1]["DBInstances"][0]["DBInstanceStatus"] != "available":
        iteration += 1
        if iteration < iter_max:
            logging.info("Waiting for DB instances to become available - iteration " + str(iteration) + " of " + str(iter_max))
            time.sleep(10*iteration)
        else:
            raise Exception("Waiting for DB Instance to become available timed out!")
cluster_endpoint = aurora.get_db_cluster(db_cluster_identifier)[1]["DBClusters"][0]["Endpoint"]
The actual error is below, coming from the while loop. DEBUG shows "Unable to locate credentials", but the credentials are there. I can deploy an Elastic Beanstalk environment from the CLI using the same AWS credentials, but not this. It looks like the aurora.create_db_instance call above failed.
DEBUG: Unable to locate credentials
Traceback (most recent call last):
File "./deploy_api.py", line 753, in <module> sync_rds()
File "./deploy_api.py", line 57, in sync_rds
while aurora.get_db_instance(db_instance_identifier + "-" + az)[1]["DBInstances"][0]["DBInstanceStatus"] != "available":
TypeError: 'NoneType' object has no attribute '__getitem__'
I had this error because an ECS task didn't have permissions to write to DynamoDB. The code causing the problem was:
from boto3 import resource
dynamodb_resource = resource("dynamodb")
The problem was resolved when I filled in the region_name, aws_access_key_id and aws_secret_access_key parameters for the resource() function call.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session.resource
If this doesn't solve your problem then check your code that connects to AWS services and make sure that you are filling in all of the proper function parameters.
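For reference, a minimal sketch of passing those parameters explicitly (the values are placeholders; an IAM role or a configured profile is usually preferable to hard-coded keys):
from boto3 import resource

# Explicit credentials keep boto3 from falling back to the EC2/ECS metadata
# endpoint (http://169.254.169.254) when it cannot find any other source.
dynamodb_resource = resource(
    "dynamodb",
    region_name="us-west-2",
    aws_access_key_id="<Key Id>",
    aws_secret_access_key="<Secret Key>",
    aws_session_token="<Token>",  # required for temporary SSO/STS credentials
)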

Delete AWS Log Groups that haven't been touched in X days

Is there a way to delete all AWS Log Groups that haven't had any writes to them in the past 30 days?
Or conversely, get the list of log groups that haven't had anything written to them in the past 30 days?
Here is a quick script I wrote:
#!/usr/bin/python3
# describe log groups
# describe log streams
# get log groups with the lastEventTimestamp after some time
# delete those log groups
# have a dry run option
# support profile
# https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/logs.html#CloudWatchLogs.Client.describe_log_streams
# https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/logs.html#CloudWatchLogs.Client.describe_log_groups
# https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/logs.html#CloudWatchLogs.Client.delete_log_group
import boto3
import time
millis = int(round(time.time() * 1000))
delete = False
debug = False
log_group_prefix='/' # NEED TO CHANGE THESE
days = 30
# Create CloudWatchLogs client
cloudwatch_logs = boto3.client('logs')
log_groups=[]
# List log groups through the pagination interface
paginator = cloudwatch_logs.get_paginator('describe_log_groups')
for response in paginator.paginate(logGroupNamePrefix=log_group_prefix):
    for log_group in response['logGroups']:
        log_groups.append(log_group['logGroupName'])
if debug:
    print(log_groups)
old_log_groups=[]
empty_log_groups=[]
for log_group in log_groups:
    response = cloudwatch_logs.describe_log_streams(
        logGroupName=log_group, #logStreamNamePrefix='',
        orderBy='LastEventTime',
        descending=True,
        limit=1
    )
    # The time of the most recent log event in the log stream in CloudWatch Logs. This number is expressed as the number of milliseconds after Jan 1, 1970 00:00:00 UTC.
    if len(response['logStreams']) > 0:
        if debug:
            print("full response is:")
            print(response)
            print("Last event is:")
            print(response['logStreams'][0]['lastEventTimestamp'])
            print("current millis is:")
            print(millis)
        if response['logStreams'][0]['lastEventTimestamp'] < millis - (days * 24 * 60 * 60 * 1000):
            old_log_groups.append(log_group)
    else:
        empty_log_groups.append(log_group)
# delete log group
if delete:
    for log_group in old_log_groups:
        response = cloudwatch_logs.delete_log_group(logGroupName=log_group)
    #for log_group in empty_log_groups:
    #    response = cloudwatch_logs.delete_log_group(logGroupName=log_group)
else:
    print("old log groups are:")
    print(old_log_groups)
    print("Number of log groups:")
    print(len(old_log_groups))
    print("empty log groups are:")
    print(empty_log_groups)
I have been using aws-cloudwatch-log-clean and I can say it works quite well.
You need boto3 installed, and then you run:
./sweep_log_streams.py [log_group_name]
It has a --dry-run option so you can check that it does what you expect first.
A note of caution: if you have a long-running process in ECS that is quiet in the logs, and the log has been truncated to empty in CloudWatch due to the log retention period, deleting its empty log stream can break and hang the service, as it has nowhere to post its logs.
I'm not aware of a simple way to do this, but you could use the awscli (or preferably python/boto3) to call describe-log-groups, then for each log group invoke describe-log-streams, and then for each log group/stream pair invoke get-log-events with a --start-time of 30 days ago. If the union of all the events arrays for the log group is empty, you know you can delete the log group.
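A rough boto3 sketch of that approach (untested; it only collects the candidate names so you can review them before deleting anything):
import time
import boto3

logs = boto3.client('logs')
cutoff_ms = int(time.time() * 1000) - 30 * 24 * 60 * 60 * 1000  # 30 days ago, in milliseconds

deletable = []
for page in logs.get_paginator('describe_log_groups').paginate():
    for group in page['logGroups']:
        name = group['logGroupName']
        has_recent_events = False
        for stream_page in logs.get_paginator('describe_log_streams').paginate(logGroupName=name):
            for stream in stream_page['logStreams']:
                events = logs.get_log_events(
                    logGroupName=name,
                    logStreamName=stream['logStreamName'],
                    startTime=cutoff_ms,
                    limit=1,
                )
                if events['events']:
                    has_recent_events = True
                    break
            if has_recent_events:
                break
        if not has_recent_events:
            deletable.append(name)

print(deletable)  # review, then call logs.delete_log_group(logGroupName=...) for each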
I did a similar setup; in my case I wanted to delete log streams older than X days from a CloudWatch log group.
remove.py
import optparse
import sys
import os
import json
import datetime

def deletefunc(loggroupname, days):
    os.system("aws logs describe-log-streams --log-group-name {} --order-by LastEventTime > test.json".format(loggroupname))
    oldstream = []
    milli = days * 24 * 60 * 60 * 1000
    today = datetime.datetime.now().strftime('%Y-%m-%d')
    with open('test.json') as json_file:
        data = json.load(json_file)
        for p in data['logStreams']:
            subtract = p['creationTime'] + milli
            sub1 = subtract / 1000
            sub2 = datetime.datetime.fromtimestamp(sub1).strftime('%Y-%m-%d')
            name = p['logStreamName']
            # The stream is older than `days` if its creation time plus `days` is already in the past.
            if sub2 < today:
                oldstream.append(name)
    for i in oldstream:
        os.system("aws logs delete-log-stream --log-group-name {} --log-stream-name {}".format(loggroupname, i))

parser = optparse.OptionParser()
parser.add_option('--log-group-name', action="store", dest="loggroupname", help="LogGroupName For Eg: testing-vpc-flow-logs", type="string")
parser.add_option('-d', '--days', action="store", dest="days", help="Type No.of Days in Integer to Delete Logs other than days provided", type="int")
options, args = parser.parse_args()

if options.loggroupname is None:
    print("Provide log group name to continue.\n")
    cmd = 'python3 ' + sys.argv[0] + ' -h'
    os.system(cmd)
    sys.exit(0)
if options.days is None:
    print("Provide days to continue.\n")
    cmd = 'python3 ' + sys.argv[0] + ' -h'
    os.system(cmd)
    sys.exit(0)
elif options.days and options.loggroupname:
    loggroupname = options.loggroupname
    days = options.days
    deletefunc(loggroupname, days)
You can run this file using the command:
python3 remove.py --log-group-name=testing-vpc-flow-logs --days=7

How to get Boto3 to append collected cloudtrail client responses to log file when run from a crontask?

I am currently working with a log collection product and want to be able to pull in my CloudTrail logs from AWS. I started using the boto3 client in order to look up the events in CloudTrail. I got the script working when running it directly from the command line, but as soon as I tried to put it in cron to pull the logs automatically over time, it stopped collecting them!
Here's a sample of the basics of what's in the script to pull the logs:
#!/usr/bin/python
import boto3
import datetime
import json
import time
import sys
import os
def initialize_log():
    try:
        log = open('/var/log/aws-cloudtrail.log', 'ab')
    except IOError as e:
        print " [!] ERROR: Cannot open /var/log/aws-cloudtrail.log (%s)" % (e.strerror)
        sys.exit(1)
    return log

def date_handler(obj):
    return obj.isoformat() if hasattr(obj, 'isoformat') else obj

def read_logs(log):
    print "[+] START: Connecting to CloudTrail Logs"
    cloudTrail = boto3.client('cloudtrail')
    starttime = ""
    endtime = ""
    if os.path.isfile('/var/log/aws-cloudtrail.bookmark'):
        try:
            with open('/var/log/aws-cloudtrail.bookmark', 'r') as myfile:
                strdate=myfile.read().replace('\n', '')
                starttime = datetime.datetime.strptime( strdate, "%Y-%m-%dT%H:%M:%S.%f" )
                print " [-] INFO: Found bookmark! Querying with a start time of " + str(starttime)
        except IOError as e:
            print " [!] ERROR: Cannot open /var/log/aws-cloudtrail.log (%s)" % (e.strerror)
    else:
        starttime = datetime.datetime.now() - datetime.timedelta(minutes=15)
        print " [-] INFO: Cannot find bookmark...Querying with start time of" + str(starttime)
    endtime = datetime.datetime.now()
    print " [-] INFO: Querying for CloudTrail Logs"
    response = cloudTrail.lookup_events(StartTime=starttime, EndTime=endtime, MaxResults=50)
    for event in response['Events']:
        log.write(json.dumps(event, default=date_handler))
        log.write("\n")
        print json.dumps(event, default=date_handler)
        print "------------------------------------------------------------"
    if 'NextToken' in response.keys():
        while 'NextToken' in response.keys():
            time.sleep(1)
            response = cloudTrail.lookup_events(StartTime=starttime, EndTime=endtime, MaxResults=50, NextToken=str(response['NextToken']))
            for event in response['Events']:
                log.write(json.dumps(event, default=date_handler))
                log.write("\n")
                print json.dumps(event, default=date_handler)
                print "------------------------------------------------------------"
    # log.write("\n TESTING 1,2,3 \n")
    log.close()
    try:
        bookmark_file = open('/var/log/aws-cloudtrail.bookmark','w')
        bookmark_file.write(str(endtime.isoformat()))
        bookmark_file.close()
    except IOError as e:
        print " [!] ERROR: Cannot set bookmark for last pull time in /var/log/aws-cloudtrail.bookmark (%s)" % (e.strerror)
        sys.exit(1)
    return True

log = initialize_log()
success = read_logs(log)
if success:
    print "[+] DONE: All results printed"
else:
    print "[+] ERROR: CloudTrail results were not able to be pulled"
I looked into it more and did some testing to confirm that permissions were right on the destination files and that the script could write to them when run from root's crontab, but I still wasn't getting the logs returned from the boto cloudtrail client unless I ran it manually.
I also checked to make sure that the default region was getting read correctly from /root/.aws/config and it looks like it is, because if I move it I see the cron email show a stack trace instead of the success messages I have built in.
I am hoping someone has already run into this and that there's a quick, simple answer!
EDIT: Access to the CloudTrail logs is allowed via the instance's IAM role, and yes, the task is scheduled under root's crontab.
Here's the email output:
From root@system Mon Mar 28 23:00:02 2016
X-Original-To: root
From: root@system (Cron Daemon)
To: root@system
Subject: Cron <root@system> /usr/bin/python /root/scripts/get-cloudtrail.py
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=root>
Date: Mon, 28 Mar 2016 19:00:02 -0400 (EDT)
[+] START: Connecting to CloudTrail Logs
[-] INFO: Found bookmark! Querying with a start time of 2016-03-28 22:55:01.395001
[-] INFO: Querying for CloudTrail Logs
[+] DONE: All results printed

Django error notification emails going to incorrect recipient(s)

I am trying to get Django to email me 500 errors. I have got DEBUG mode turned off as required, and am sending to a local SMTP server. I'm using Python 2.7.2 and Django 1.5.1.
Django is sending to the incorrect and invalid recipients "n, e":
# python -m smtpd -n -c DebuggingServer localhost:25
---------- MESSAGE FOLLOWS ----------
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: [Django] ERROR (EXTERNAL IP): Internal Server Error: /
From: sender@example.com
To: n, e
Extract from settings.py:
ADMINS = (
    ('Me', 'me@example.com')
)
SEND_BROKEN_LINK_EMAILS = True
MANAGERS = ADMINS
SERVER_EMAIL = 'sender@example.com'
ADMINS = (
    ('Me', 'me@example.com'),
    # ^ missing comma
)
Parentheses do not create a tuple; the comma does (and empty parentheses create the empty tuple):
>>> 1 == (1)
True
>>> a = 1,
>>> a == (1,)
True
Refer to Tuple Syntax for explanation.
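A quick interpreter session showing the difference (the values are placeholders; the list comprehension approximates how Django builds the recipient list from ADMINS):
>>> ADMINS = (
...     ('Me', 'me@example.com')    # no trailing comma: the outer parentheses just group
... )
>>> [a[1] for a in ADMINS]          # each "admin" is now a character-indexed string
['e', 'e']
>>> ADMINS = (
...     ('Me', 'me@example.com'),   # trailing comma: a tuple of one (name, email) pair
... )
>>> [a[1] for a in ADMINS]
['me@example.com']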