Is there a way via the API to export Mailgun's logs to a local file for long-term storage? We need to keep our mailing logs for longer than the 30 days Mailgun retains them.

You can only request 300 events at a time, so you'll have to continue fetching the next page until you run out of results. You can then do whatever you'd like with the log items, such as generate a csv, or add items in your database. Check out https://documentation.mailgun.com/en/latest/api-events.html#events for the API docs. Here's an example in Python:
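import csv
from datetime import datetime, timedelta, timezone

import requests

DATETIME_FORMAT = '%d %B %Y %H:%M:%S -0000'
MAILGUN_API_KEY = 'YOUR MAILGUN ACCESS KEY'
# Placeholder domain: replace YOUR_DOMAIN with your Mailgun sending domain.
EVENTS_URL = 'https://api.mailgun.net/v3/YOUR_DOMAIN/events'


def get_logs(start_date, end_date, next_url=None):
    if next_url:
        # Follow the paging URL returned by the previous response.
        logs = requests.get(next_url, auth=("api", MAILGUN_API_KEY))
    else:
        logs = requests.get(
            EVENTS_URL,
            auth=("api", MAILGUN_API_KEY),
            params={"begin": start_date.strftime(DATETIME_FORMAT),
                    "end": end_date.strftime(DATETIME_FORMAT),
                    "ascending": "yes",
                    "pretty": "yes",
                    "limit": 300,
                    "event": "accepted"},
        )
    return logs.json()


# The format string hard-codes a -0000 offset, so work in UTC.
start = datetime.now(timezone.utc) - timedelta(2)
end = datetime.now(timezone.utc) - timedelta(1)
log_items = []
current_page = get_logs(start, end)
while current_page.get('items'):
    items = current_page.get('items')
    log_items.extend(items)
    next_url = current_page.get('paging').get('next', None)
    current_page = get_logs(start, end, next_url=next_url)

if log_items:
    keys = log_items[0].keys()
    with open('mailgun{0}.csv'.format(start.strftime('%Y-%m-%d')), 'w', newline='') as output_file:
        # Later events may carry keys the first one lacks; ignore those columns.
        dict_writer = csv.DictWriter(output_file, keys, extrasaction='ignore')
        dict_writer.writeheader()
        dict_writer.writerows(log_items)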
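The "keep fetching the next page until you run out of results" loop can also be written as a generator, which keeps the paging logic separate from whatever you do with the items. This is only a sketch: `iter_items` and `fetch_page` are hypothetical names, and the fake pages below stand in for real API responses that have the `items` and `paging.next` keys used in the example above.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], Dict],
               max_pages: int = 10_000) -> Iterator[Dict]:
    """Yield every item from a paged API until a page comes back empty.

    fetch_page(None) must return the first page; fetch_page(url) must return
    the page behind a "next" URL. max_pages is a safety cap against loops.
    """
    next_url = None
    for _ in range(max_pages):
        page = fetch_page(next_url)
        items = page.get("items") or []
        if not items:
            return  # an empty page means there are no more results
        yield from items
        next_url = (page.get("paging") or {}).get("next")
        if not next_url:
            return


# Fake pages keyed by URL, standing in for real HTTP responses.
fake_pages = {
    None: {"items": [{"id": 1}, {"id": 2}], "paging": {"next": "page-2"}},
    "page-2": {"items": [{"id": 3}], "paging": {"next": "page-3"}},
    "page-3": {"items": [], "paging": {"next": "page-4"}},
}
print([item["id"] for item in iter_items(fake_pages.__getitem__)])  # [1, 2, 3]
```

With a real client, `fetch_page` would wrap the `requests.get` call and return `response.json()`.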

There's a simple Python script to retrieve logs for a domain; however, I haven't checked whether it uses the Events API or the now-deprecated Logs API...

The original answer doesn't work without modifications. Here is the updated code that works:
#!/usr/bin/env python3
# Uses the Mailgun API to save logs to JSON file
# Set environment variables MAILGUN_API_KEY and MAILGUN_SERVER
# Optionally set MAILGUN_LOG_DAYS to number of days to retrieve logs for
# Based on https://stackoverflow.com/a/49825979
# See API guide https://documentation.mailgun.com/en/latest/api-intro.html#introduction
import os
import sys
import json
import requests
from datetime import datetime, timedelta
from email import utils

# Environment variables are strings, so convert the day count to an int.
DAYS_TO_GET = int(os.environ.get("MAILGUN_LOG_DAYS", 7))
MAILGUN_API_KEY = os.environ.get("MAILGUN_API_KEY")
MAILGUN_SERVER = os.environ.get("MAILGUN_SERVER")
if not MAILGUN_API_KEY or not MAILGUN_SERVER:
    print("Set environment variable MAILGUN_API_KEY and MAILGUN_SERVER")
    sys.exit(1)
ITEMS_PER_PAGE = 300  # API is limited to 300


def get_logs(start_date, next_url=None):
    if next_url:
        print(f"Getting next batch of {ITEMS_PER_PAGE} from {next_url}...")
        response = requests.get(next_url, auth=("api", MAILGUN_API_KEY))
    else:
        url = 'https://api.mailgun.net/v3/{0}/events'.format(MAILGUN_SERVER)
        start_date_formatted = utils.format_datetime(start_date)  # Mailgun wants it in RFC 2822
        print(f"Getting first batch of {ITEMS_PER_PAGE} from {url} since {start_date_formatted}...")
        response = requests.get(
            url,
            auth=("api", MAILGUN_API_KEY),
            params={"begin": start_date_formatted,
                    "ascending": "yes",
                    "pretty": "yes",
                    "limit": ITEMS_PER_PAGE,
                    "event": "accepted"},
        )
    return response.json()


start = datetime.now() - timedelta(DAYS_TO_GET)
log_items = []
current_page = get_logs(start)
while current_page.get('items'):
    items = current_page.get('items')
    log_items.extend(items)
    print(f"Retrieved {len(items)} records for a total of {len(log_items)}")
    next_url = current_page.get('paging').get('next', None)
    current_page = get_logs(start, next_url=next_url)

file_out = f"mailgun-logs-{MAILGUN_SERVER}_{start.strftime('%Y-%m-%d')}_to_{datetime.now().strftime('%Y-%m-%d')}.json"
print(f"Writing out {file_out}")
with open(file_out, 'w') as file_out_handle:
    json.dump(log_items, file_out_handle, indent=4)
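One detail worth checking in the script above is the RFC 2822 timestamp. `email.utils.format_datetime` labels a naive datetime (such as the result of `datetime.now()`) with `-0000`, so local wall-clock time is sent without a real offset. The sketch below shows the difference with a timezone-aware value; `rfc2822_days_ago` is a hypothetical helper, and the fixed date is used only so the output is reproducible.

```python
from datetime import datetime, timedelta, timezone
from email import utils


def rfc2822_days_ago(days: int, now: datetime) -> str:
    """Return the RFC 2822 timestamp for `days` before `now`."""
    return utils.format_datetime(now - timedelta(days=days))


# Timezone-aware UTC value: the offset is explicit.
aware_now = datetime(2024, 3, 15, 12, 0, 0, tzinfo=timezone.utc)
print(rfc2822_days_ago(7, aware_now))   # Fri, 08 Mar 2024 12:00:00 +0000

# Naive value: formatted with -0000, whatever the local timezone is.
naive_now = datetime(2024, 3, 15, 12, 0, 0)
print(rfc2822_days_ago(7, naive_now))   # Fri, 08 Mar 2024 12:00:00 -0000
```

If the machine running the export is not on UTC, passing `datetime.now(timezone.utc)` instead of `datetime.now()` should avoid shifting the start of the export window by the local offset.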

You can have a look at MailgunLogger.
It's an open source project that can easily be deployed via Docker to fetch and store Mailgun events in a database. It features a dead simple, although rudimentary, search and allows you to add multiple accounts/domains.
Run via Docker:
docker run -d -p 5050:5050 \
-e "ML_DB_USER=username" \
-e "ML_DB_PASSWORD=password" \
-e "ML_DB_NAME=mailgun_logger" \
-e "ML_DB_HOST=my_db_host" \
--name mailgun_logger jackjoe/mailgun_logger
From there on, the interface guides you to configure everything.
In the OP's case, this project can also be used in a more headless fashion, where you only use the database instead of the provided UI.

You can use Skyvia for exporting logs from Mailgun for LTS. Skyvia is a cloud tool for automatic Mailgun CSV import/export with powerful transformations. You can also export Mailgun ListMembers, Templates, Tags, etc. to CSV automatically on a schedule.


How can I get a list of all stopped Cloud SQL (GCP) instances in Python? I am using the Google Cloud API for this purpose:

from googleapiclient import discovery
PROJECT = 'gcp-test-1234'
sql_client = discovery.build('sqladmin', 'v1beta4')
resp = sql_client.instances().list(project=PROJECT).execute()
But in the response I get the state "RUNNABLE" for stopped instances, so how can I verify programmatically whether an instance is running or stopped?
I have also checked gcloud sql instances describe gcp-test-1234-test-db, which reports the state as "STOPPED".
How can I achieve this programmatically using Python?
In the REST API, RUNNABLE in the state field means that the instance is running, or has been stopped by the owner, as stated here.
You need to read from the activationPolicy field, where ALWAYS means your instance is running and NEVER means it is stopped. Something like the following will work:
from pprint import pprint
from googleapiclient import discovery
service = discovery.build('sqladmin', 'v1beta4')
project = 'gcp-test-1234'
instance = 'gcp-test-1234-test-db'
request = service.instances().get(project=project,instance=instance)
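response = request.execute()
# ALWAYS means the instance is running, NEVER means it is stopped.
pprint(response['settings']['activationPolicy'])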
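To answer the original question (all stopped instances in a project rather than one named instance), the same check can be applied to each entry returned by `instances().list`. The following is a minimal sketch, assuming each instance resource is a dict with `name`, `state`, and `settings.activationPolicy` as described above; `displayed_state`, `stopped_instances`, and `list_stopped` are hypothetical helper names.

```python
from typing import Dict, Iterable, List, Optional


def displayed_state(instance: Dict) -> Optional[str]:
    """Report STOPPED when activationPolicy is NEVER, else the raw state."""
    policy = (instance.get("settings") or {}).get("activationPolicy")
    if policy == "NEVER":
        return "STOPPED"
    return instance.get("state")


def stopped_instances(instances: Iterable[Dict]) -> List[str]:
    """Return the names of the instances whose displayed state is STOPPED."""
    return [i["name"] for i in instances if displayed_state(i) == "STOPPED"]


def list_stopped(project: str) -> List[str]:
    """Page through every instance in the project (needs credentials)."""
    from googleapiclient import discovery  # imported lazily on purpose

    service = discovery.build("sqladmin", "v1beta4")
    names: List[str] = []
    request = service.instances().list(project=project)
    while request is not None:
        response = request.execute()
        names.extend(stopped_instances(response.get("items", [])))
        request = service.instances().list_next(
            previous_request=request, previous_response=response)
    return names


# Offline demonstration with hand-written instance resources.
sample = [
    {"name": "db-a", "state": "RUNNABLE", "settings": {"activationPolicy": "NEVER"}},
    {"name": "db-b", "state": "RUNNABLE", "settings": {"activationPolicy": "ALWAYS"}},
]
print(stopped_instances(sample))  # ['db-a']
```

Calling `list_stopped('gcp-test-1234')` would run the same filter against the live API.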
Another option would be to use the Cloud SDK command directly from your python file:
import os
os.system("gcloud sql instances describe gcp-test-1234-test-db | grep state | awk {'print $2'}")
Or with subprocess:
import subprocess
subprocess.run("gcloud sql instances describe gcp-test-1234-test-db | grep state | awk {'print $2'}", shell=True)
Note that when you run gcloud sql instances describe your-instance --log-http on a stopped instance, the API response contains "state": "RUNNABLE", yet the gcloud command shows the status STOPPED. This is because the command derives the displayed status from the activationPolicy of the API response, rather than from the state field, whenever the policy is NEVER.
If you want to check the piece of code that translates the activationPolicy to the status, you can see it in the SDK. The gcloud tool is written in Python:
cat $(gcloud info --format "value(config.paths.sdk_root)")/lib/googlecloudsdk/api_lib/sql/instances.py|grep "class DatabaseInstancePresentation(object)" -A 17
You'll see the following:
class DatabaseInstancePresentation(object):
  """Represents a DatabaseInstance message that is modified for user visibility."""

  def __init__(self, orig):
    for field in orig.all_fields():
      if field.name == 'state':
        if orig.settings and orig.settings.activationPolicy == messages.Settings.ActivationPolicyValueValuesEnum.NEVER:
          self.state = 'STOPPED'
        else:
          self.state = orig.state
      else:
        value = getattr(orig, field.name)
        if value is not None and not (isinstance(value, list) and not value):
          if field.name in ['currentDiskSize', 'maxDiskSize']:
            setattr(self, field.name, six.text_type(value))
          else:
            setattr(self, field.name, value)

Google Cloud Pub/Sub :: google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded

I was testing stream processing with Google Cloud Pub/Sub.
A publisher sends messages to a topic, an Apache Beam pipeline reads them from Pub/Sub, and I check them with beam.Map(print).
Reading the messages from Pub/Sub worked, but an error occurred after all the messages had been read.
This code publishes messages to the topic:
from google.cloud import pubsub_v1
from google.cloud import bigquery
import time

# TODO(developer)
project_id = "your-project-id"
topic_id = "your-topic-id"

# Construct a BigQuery client object.
client = bigquery.Client()

# Configure the batch to publish as soon as there is ten messages,
# one kilobyte of data, or one second has passed.
batch_settings = pubsub_v1.types.BatchSettings(
    max_messages=10,  # default 100
    max_bytes=1024,  # default 1 MB
    max_latency=1,  # default 10 ms
)
publisher = pubsub_v1.PublisherClient(batch_settings)
topic_path = publisher.topic_path(project_id, topic_id)

# The query must return three columns: category, language, count.
query = """
    FROM `[bigquery-schema.bigquery-dataset.bigquery-tablename]`
"""
query_job = client.query(query)


# Resolve the publish future in a separate thread.
def callback(topic_message):
    message_id = topic_message.result()
    print(message_id)


print("The query data:")
for row in query_job:
    data = u"category={}, language={}, count={}".format(row[0], row[1], row[2])
    data = data.encode("utf-8")
    topic_message = publisher.publish(topic_path, data=data)
    topic_message.add_done_callback(callback)

print("Published messages with batch settings.")
Apache Beam code (for reading and processing data from Pub/Sub):
import argparse
import datetime
import json
import logging

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
import apache_beam.transforms.window as window


class GroupWindowsIntoBatches(beam.PTransform):
    """A composite transform that groups Pub/Sub messages based on publish
    time and outputs a list of dictionaries, where each contains one message
    and its publish timestamp.
    """

    def __init__(self, window_size):
        # Convert minutes into seconds.
        self.window_size = int(window_size * 60)

    def expand(self, pcoll):
        return (
            pcoll
            # Assigns window info to each Pub/Sub message based on its
            # publish timestamp.
            | "Window into Fixed Intervals"
            >> beam.WindowInto(window.FixedWindows(self.window_size))
            | "Add timestamps to messages" >> beam.ParDo(AddTimestamps())
            # Use a dummy key to group the elements in the same window.
            # Note that all the elements in one window must fit into memory
            # for this. If the windowed elements do not fit into memory,
            # please consider using `beam.util.BatchElements`.
            # https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html#apache_beam.transforms.util.BatchElements
            | "Add Dummy Key" >> beam.Map(lambda elem: (None, elem))
            | "Groupby" >> beam.GroupByKey()
            | "Abandon Dummy Key" >> beam.MapTuple(lambda _, val: val)
        )


class AddTimestamps(beam.DoFn):
    def process(self, element, publish_time=beam.DoFn.TimestampParam):
        """Processes each incoming windowed element by extracting the Pub/Sub
        message and its publish timestamp into a dictionary. `publish_time`
        defaults to the publish timestamp returned by the Pub/Sub server. It
        is bound to each element by Beam at runtime.
        """
        yield {
            "message_body": element.decode("utf-8"),
            "publish_time": datetime.datetime.utcfromtimestamp(
                float(publish_time)
            ).strftime("%Y-%m-%d %H:%M:%S.%f"),
        }


class WriteBatchesToGCS(beam.DoFn):
    def __init__(self, output_path):
        self.output_path = output_path

    def process(self, batch, window=beam.DoFn.WindowParam):
        """Write one batch per file to a Google Cloud Storage bucket. """
        ts_format = "%H:%M"
        window_start = window.start.to_utc_datetime().strftime(ts_format)
        window_end = window.end.to_utc_datetime().strftime(ts_format)
        filename = "-".join([self.output_path, window_start, window_end])

        with beam.io.gcp.gcsio.GcsIO().open(filename=filename, mode="w") as f:
            for element in batch:
                f.write("{}\n".format(json.dumps(element)).encode("utf-8"))


class test_func(beam.DoFn):
    def __init__(self, delimiter=','):
        self.delimiter = delimiter

    def process(self, topic_message):
        # Print each message so it can be checked on the console.
        print(topic_message)


def run(input_topic, output_path, window_size=1.0, pipeline_args=None):
    # `save_main_session` is set to true because some DoFn's rely on
    # globally imported modules.
    pipeline_options = PipelineOptions(
        pipeline_args, streaming=True, save_main_session=True
    )

    with beam.Pipeline(options=pipeline_options) as pipeline:
        (
            pipeline
            | "Read PubSub Messages"
            >> beam.io.ReadFromPubSub(topic=input_topic)
            | "Pardo" >> beam.ParDo(test_func(','))
        )


if __name__ == "__main__":  # noqa
    input_topic = 'projects/[project-id]/topics/[pub/sub-name]'
    output_path = 'gs://[bucket-name]/[file-directory]'
    run(input_topic, output_path, 2)
# [END pubsub_to_gcs]
As a temporary measure, I set return_immediately=True, but this is not a fundamental solution either.
Thank you for reading.
This seems to be a known issue in the Pub/Sub client libraries, reported in another SO thread. It appears to have been addressed recently in version 1.4.2, but that version is not yet included in the Beam dependencies, which still require google-cloud-pubsub>=0.39.0,<1.1.0.
I did some research and found that DataflowRunner appears to handle this error better than DirectRunner, which is maintained by the Apache Beam team. The issue has already been reported on the Beam site and is not resolved yet.
Also note that a troubleshooting guide for the DEADLINE_EXCEEDED error can be found here. You can check whether any of its advice helps in your case, such as upgrading to the latest version of the client library.
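To see which side of that version boundary an environment is on, the installed client version can be compared against 1.4.2, the release named in the answer above. This is a rough sketch: `parse_version`, `has_fix`, and `installed_version` are hypothetical helpers, and the comparison is a plain numeric tuple comparison rather than a full version-specifier parser.

```python
from typing import Optional, Tuple

FIXED_IN = (1, 4, 2)  # release the answer above says addressed the issue


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.4.2' into (1, 4, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(installed: str) -> bool:
    """True when the installed version is at least FIXED_IN."""
    return parse_version(installed) >= FIXED_IN


def installed_version(package: str = "google-cloud-pubsub") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    from importlib import metadata
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(has_fix("1.0.2"))  # False: inside the >=0.39.0,<1.1.0 range Beam pins
print(has_fix("1.4.2"))  # True
```

In a real environment, `has_fix(installed_version() or "0")` would report whether the library that Beam resolved includes the fix.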

How to get information about all the VMs in all projects in GCP

How can I get information about all the VMs in all projects in GCP?
I have multiple projects in my GCP account, and I need the operating system, the OS version, and the OS build version for all the VMs in all projects.
I didn't find a tool to do that, so I coded something that you can use.
This code could be improved, but it shows a way to scan all projects and get information about the OS.
Let me know if it helps you.
Pip install:
!pip install google-cloud
!pip install google-api-python-client
!pip install oauth2client
import subprocess
import sys
import logging
import threading
import pprint

logger = logging.Logger('catch_all')


def execute_bash(parameters):
    try:
        return subprocess.check_output(parameters)
    except Exception as e:
        logger.error(e)
        logger.error('ERROR: Looking in jupyter console for more information')


def scan_gce(project, results_scan):
    print('Scanning project: "{}"'.format(project))
    ex = execute_bash(['gcloud', 'compute', 'instances', 'list', '--project', project, '--format=value(name,zone,status)'])
    list_result_vms = []
    if ex:
        list_vms = ex.decode("utf-8").split('\n')
        for vm in list_vms:
            if vm:
                vm_info = vm.split('\t')
                print('Scanning Instance: "{}" in project "{}"'.format(vm_info[0], project))
                results_bytes = execute_bash(['gcloud', 'compute', '--project', project,
                                              'ssh', '--zone', vm_info[1], vm_info[0],
                                              '--command', 'cat /etc/*-release'])
                if results_bytes:
                    results = results_bytes.decode("utf-8").split('\n')
                    list_result_vms.append({'instance_name': vm_info[0], 'result': results})
    results_scan.append({'project': project, 'vms': list_result_vms})


list_projects = execute_bash(['gcloud', 'projects', 'list', '--format=value(projectId)']).decode("utf-8").split('\n')
threads_project = []
results_scan = []
for project in list_projects:
    if project:  # skip the empty string left by the trailing newline
        t = threading.Thread(target=scan_gce, args=(project, results_scan))
        threads_project.append(t)
        t.start()
# Wait for every project scan to finish before printing.
for t in threads_project:
    t.join()
for result in results_scan:
    pprint.pprint(result)
You can find the full code here:
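As a follow-up to the scan above: the `result` it stores for each VM is the raw list of lines printed by `cat /etc/*-release`. A small parser can reduce those lines to the fields the question asks for. This is a sketch under the assumption that the VM provides an os-release style file with `KEY=value` lines; `parse_os_release` is a hypothetical helper, and `BUILD_ID` is an optional key that many distributions do not set.

```python
from typing import Dict, Iterable


def parse_os_release(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blanks, comments and free-text lines."""
    info: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


# Hand-written sample in the shape of an os-release file.
sample = [
    'PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"',
    'NAME="Debian GNU/Linux"',
    'VERSION_ID="11"',
    'VERSION="11 (bullseye)"',
    'ID=debian',
    '',
]
info = parse_os_release(sample)
print(info["NAME"], info["VERSION_ID"])       # Debian GNU/Linux 11
print(info.get("BUILD_ID", "not provided"))   # not provided
```

Applied to the scan results, this would be called as `parse_os_release(vm['result'])` for each entry in a project's `vms` list.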
Quick and dirty:
gcloud projects list --format 'value(PROJECT_ID)' >> proj_list
cat proj_list | while read pj; do gcloud compute instances list --project $pj; done
You can use the following command in the Cloud Shell to fetch all projects and then show the instances for each of them:
for i in $(gcloud projects list | sed 1d | cut -f1 -d$' '); do
gcloud compute instances list --project $i;done;
Note: make sure you have the compute.instances.list permission on all of the projects.
Here is how to do it with google-api-python-client (pip3 install -U google-api-python-client), without using bash. Note that this is meant to be run with keyless auth; using service account keys is bad practice.
from googleapiclient import discovery
from googleapiclient.errors import HttpError
import yaml
import structlog

logger = structlog.stdlib.get_logger()


def get_projects() -> list:
    projects: list = []
    service = discovery.build('cloudresourcemanager', 'v1', cache_discovery=False)
    request = service.projects().list()
    response = request.execute()
    for project in response.get('projects', []):
        projects.append(project.get('projectId'))
    logger.debug('got projects', projects=projects)
    return projects


def get_zones(project: str) -> list:
    zones: list = []
    service = discovery.build('compute', 'v1', cache_discovery=False)
    request = service.zones().list(project=project)
    while request is not None:
        response = request.execute()
        if 'items' not in response:
            logger.warning('no zones found')
            return []
        for zone in response.get('items'):
            zones.append(zone.get('name'))
        request = service.zones().list_next(previous_request=request, previous_response=response)
    logger.debug('got zones', zones=zones)
    return zones


def get_vms() -> list:
    vms: list = []
    projects: list = get_projects()
    service = discovery.build('compute', 'v1', cache_discovery=False)
    for project in projects:
        try:
            zones: list = get_zones(project)
            for zone in zones:
                request = service.instances().list(project=project, zone=zone)
                response = request.execute()
                if 'items' in response:
                    for vm in response.get('items'):
                        ips: list = []
                        for interface in vm.get('networkInterfaces'):
                            ips.append(interface.get('networkIP'))
                        vms.append({vm.get('name'): {'self_link': vm.get('selfLink'), 'ips': ips}})
        except HttpError:
            # Skip projects that cannot be queried, e.g. Compute Engine API disabled.
            pass
    logger.debug('got vms', vms=vms)
    return vms


if __name__ == '__main__':
    data = get_vms()
    with open('output.yaml', 'w') as fh:
        yaml.dump(data, fh)

AWS SQS: Sending dynamic message using boto

I have a working Python/boto script which posts a message to my AWS SQS queue. The message body, however, is hardcoded into the script.
I created a file called ~/file which contains two values:
$ cat ~/file
Username 'encrypted_password_string'
I would like my boto script (see below) to send a message to my AWS SQS queue that contains these two values.
Can anyone please advise how to modify my script below so that the message body sent to SQS contains the contents of ~/file? Please also take note of the special characters that exist within an encrypted password string:
username d5MopV/EsfSKk8BExCyLHFwNfBrOTzQ1
#!/usr/bin/env python
conf = {
    "sqs-access-key": "xxxx",
    "sqs-secret-key": "xxxx",
    "sqs-queue-name": "UserPassChange",
    "sqs-region": "xxxx",
    "sqs-path": "sqssend"
}

import boto.sqs
conn = boto.sqs.connect_to_region(
    conf.get('sqs-region'),
    aws_access_key_id=conf.get('sqs-access-key'),
    aws_secret_access_key=conf.get('sqs-secret-key')
)
q = conn.create_queue(conf.get('sqs-queue-name'))

from boto.sqs.message import RawMessage
m = RawMessage()
m.set_body('hardcoded message')
retval = q.write(m)
print 'added message, got retval: %s' % retval
One way to get it working. In the script I added:
import commands
import json
then added:
USERNAME = commands.getoutput("echo $(who am i | awk '{print $1}')")
PASS = commands.getoutput("cat /tmp/.s")
and then added these values to my message body:
MSG = RawMessage()
MSG.set_body(json.dumps({'pass': PASS, 'user': USERNAME}))
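The same idea can be sketched without shelling out, by reading the two values straight from the file the question describes. The snippet assumes the file holds a single `username password` line separated by whitespace; `build_message_body` and `read_credentials_file` are hypothetical helpers, and the JSON keys mirror the ones used in the answer above. Encoding with `json.dumps` means characters such as `/` in the encrypted string need no manual escaping.

```python
import json
from pathlib import Path


def build_message_body(text: str) -> str:
    """Split 'username password' on the first whitespace run; return JSON."""
    username, password = text.strip().split(None, 1)
    return json.dumps({"user": username, "pass": password})


def read_credentials_file(path: str = "~/file") -> str:
    """Read the file from the question; '~' is expanded to the home directory."""
    return Path(path).expanduser().read_text()


# Sample line taken from the question.
body = build_message_body("username d5MopV/EsfSKk8BExCyLHFwNfBrOTzQ1\n")
print(body)  # {"user": "username", "pass": "d5MopV/EsfSKk8BExCyLHFwNfBrOTzQ1"}
```

The resulting string could then be passed to `set_body(...)` in place of the hardcoded message, for example `m.set_body(build_message_body(read_credentials_file()))`.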
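The following example shows how to use Boto3 to send the contents of a file as a message. It uses moto to mock SQS, so no real queue is needed.
import boto3
from moto import mock_sqs


@mock_sqs
def test_sqs():
    sqs = boto3.resource('sqs', 'us-east-1')
    queue = sqs.create_queue(QueueName='votes')
    # 'message.txt' is a placeholder path; the file is assumed to contain 'tasty\n'.
    with open('message.txt', 'r') as message_file:
        queue.send_message(MessageBody=message_file.read())
    messages = queue.receive_messages()
    assert len(messages) == 1
    assert messages[0].body == 'tasty\n'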

Uploading video to YouTube and adding it to playlist using YouTube Data API v3 in Python

I wrote a script to upload a video to YouTube using the YouTube Data API v3 in Python, with the help of the example given in Example code.
I wrote another script to add the uploaded video to a playlist using the same YouTube Data API v3; you can see it here.
After that, I wrote a single script to upload the video and add it to the playlist. I took care of authentication and scopes, but I am still getting a permission error. Here is my new script:
import httplib
import httplib2
import os
import random
import sys
import time

from apiclient.discovery import build
from apiclient.errors import HttpError
from apiclient.http import MediaFileUpload
from oauth2client.file import Storage
from oauth2client.client import flow_from_clientsecrets
from oauth2client.tools import run

# Explicitly tell the underlying HTTP transport library not to retry, since
# we are handling retry logic ourselves.
httplib2.RETRIES = 1

# Maximum number of times to retry before giving up.
MAX_RETRIES = 10

# Always retry when these exceptions are raised.
RETRIABLE_EXCEPTIONS = (httplib2.HttpLib2Error, IOError, httplib.NotConnected,
                        httplib.IncompleteRead, httplib.ImproperConnectionState,
                        httplib.CannotSendRequest, httplib.CannotSendHeader,
                        httplib.ResponseNotReady, httplib.BadStatusLine)

# Always retry when an apiclient.errors.HttpError with one of these status
# codes is raised.
RETRIABLE_STATUS_CODES = [500, 502, 503, 504]

CLIENT_SECRETS_FILE = "client_secrets.json"

# A limited OAuth 2 access scope that allows for uploading files, but not other
# types of account access.
YOUTUBE_UPLOAD_SCOPE = "https://www.googleapis.com/auth/youtube.upload"
YOUTUBE_API_SERVICE_NAME = "youtube"
YOUTUBE_API_VERSION = "v3"

# Helpful message to display if the CLIENT_SECRETS_FILE is missing.
MISSING_CLIENT_SECRETS_MESSAGE = """
WARNING: Please configure OAuth 2.0

To make this sample run you will need to populate the client_secrets.json file
found at:

   %s

with information from the APIs Console

For more information about the client_secrets.json file format, please visit:
""" % os.path.abspath(os.path.join(os.path.dirname(__file__),
                                   CLIENT_SECRETS_FILE))


def get_authenticated_service():
    flow = flow_from_clientsecrets(CLIENT_SECRETS_FILE, scope=YOUTUBE_UPLOAD_SCOPE,
                                   message=MISSING_CLIENT_SECRETS_MESSAGE)

    storage = Storage("%s-oauth2.json" % sys.argv[0])
    credentials = storage.get()

    if credentials is None or credentials.invalid:
        credentials = run(flow, storage)

    return build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION,
                 http=credentials.authorize(httplib2.Http()))


def initialize_upload(title, description, keywords, privacyStatus, file):
    youtube = get_authenticated_service()

    tags = None
    if keywords:
        tags = keywords.split(",")

    insert_request = youtube.videos().insert(
        part="snippet,status",
        body=dict(
            snippet=dict(title=title, description=description, tags=tags),
            status=dict(privacyStatus=privacyStatus)
        ),
        # chunksize=-1 means that the entire file will be uploaded in a single
        # HTTP request. (If the upload fails, it will still be retried where it
        # left off.) This is usually a best practice, but if you're using Python
        # older than 2.6 or if you're running on App Engine, you should set the
        # chunksize to something like 1024 * 1024 (1 megabyte).
        media_body=MediaFileUpload(file, chunksize=-1, resumable=True)
    )

    vid = resumable_upload(insert_request)

    # Here I added lines to add video to playlist
    # youtube = get_authenticated_service()
    youtube.playlistItems().insert(
        part="snippet",
        body={
            'snippet': {
                'playlistId': "PL2JW1S4IMwYubm06iDKfDsmWVB-J8funQ",
                'resourceId': {
                    'kind': 'youtube#video',
                    'videoId': vid
                }
                # 'position': 0
            }
        }
    ).execute()
    return vid


def resumable_upload(insert_request):
    response = None
    error = None
    retry = 0
    vid = None
    while response is None:
        try:
            print "Uploading file..."
            status, response = insert_request.next_chunk()
            if 'id' in response:
                print "'%s' (video id: %s) was successfully uploaded." % (
                    title, response['id'])
                vid = response['id']
            else:
                exit("The upload failed with an unexpected response: %s" % response)
        except HttpError, e:
            if e.resp.status in RETRIABLE_STATUS_CODES:
                error = "A retriable HTTP error %d occurred:\n%s" % (e.resp.status,
                                                                     e.content)
            else:
                raise
        except RETRIABLE_EXCEPTIONS, e:
            error = "A retriable error occurred: %s" % e

        if error is not None:
            print error
            retry += 1
            if retry > MAX_RETRIES:
                exit("No longer attempting to retry.")

            max_sleep = 2 ** retry
            sleep_seconds = random.random() * max_sleep
            print "Sleeping %f seconds and then retrying..." % sleep_seconds
            time.sleep(sleep_seconds)
    return vid


if __name__ == '__main__':
    title = "sample title"
    description = "sample description"
    # The three values below are placeholders; the original post does not show them.
    keywords = "sample,keywords"
    privacyStatus = "private"
    file = "sample_video.mp4"
    vid = initialize_upload(title, description, keywords, privacyStatus, file)
    print 'video ID is :', vid
I am not able to figure out what is wrong. I am getting a permission error, although both scripts work fine independently.
Could anyone help me figure out where I went wrong, or how to upload a video and add it to a playlist as well?
I found the answer: the scopes in the two independent scripts are different.
The scope for uploading is "https://www.googleapis.com/auth/youtube.upload".
The scope for adding to a playlist is "https://www.googleapis.com/auth/youtube".
As the scopes are different, I had to handle authentication separately.
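One possible alternative to authenticating twice is to request both scopes in a single flow. The sketch below only computes the scope list; the operation names and `scopes_for` are hypothetical, and it assumes the OAuth flow accepts a list of scopes (oauth2client documents the `scope` argument of `flow_from_clientsecrets` as a string or an iterable of strings).

```python
from typing import Dict, Iterable, List

# Scope URLs are the two quoted in the answer above; the keys are made up.
SCOPES_BY_OPERATION: Dict[str, str] = {
    "upload_video": "https://www.googleapis.com/auth/youtube.upload",
    "add_to_playlist": "https://www.googleapis.com/auth/youtube",
}


def scopes_for(operations: Iterable[str]) -> List[str]:
    """Return the de-duplicated scopes needed for the given operations."""
    scopes: List[str] = []
    for operation in operations:
        scope = SCOPES_BY_OPERATION[operation]
        if scope not in scopes:
            scopes.append(scope)
    return scopes


needed = scopes_for(["upload_video", "add_to_playlist"])
print(len(needed))  # 2
```

The list could then be passed as `scope=needed` when building the flow. A credentials file stored earlier with only the upload scope would presumably have to be deleted first, so that consent is requested again for both scopes.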