Google Cloud Storage giving ServiceUnavailable: 503 exception Backend Error - google-cloud-platform

I'm trying to upload a file to a Google Cloud Storage bucket. While making it public, I intermittently get this exception from Google; it occurs roughly once in every 20 uploads.
google.api_core.exceptions.ServiceUnavailable: 503 GET https://www.googleapis.com/storage/v1/b/bucket_name/o/folder_name%2FPolicy-APP-000456384.2019-05-16-023805.pdf/acl: Backend Error
I'm using Python 3 and have tried updating google-cloud-storage to 1.15.0, but it didn't help.
import logging
import six
from datetime import datetime

from google.cloud import storage

logger = logging.getLogger(__name__)

class GoogleStorageHelper:
    def __init__(self, project_name):
        self.client = storage.Client(project=project_name)

    def upload_file(self, bucket_name, file, file_name, content_type, blob_name, is_stream):
        safe_file_name = self.get_safe_filename(file_name)
        bucket = self.client.bucket(bucket_name)
        blob = bucket.blob(safe_file_name)

        if is_stream:
            blob.upload_from_string(file, content_type=content_type)
        else:
            blob.upload_from_filename(file, content_type=content_type)

        blob.make_public()  # Getting the error here
        url = blob.public_url
        if isinstance(url, six.binary_type):
            url = url.decode('utf-8')

        logger.info('File uploaded, URL: {}'.format(url))
        return url

    @staticmethod
    def get_safe_filename(file_name):
        basename, extension = file_name.rsplit('.', 1)
        return '{0}.{1}.{2}'.format(basename, datetime.now().strftime('%Y-%m-%d-%H%M%S'), extension)
Have you faced this kind of problem and solved it, or do you have any ideas for fixing this issue?

This is a known recent issue with GCS and the Python make_public() method; the GCS team is working on it.
As a quick mitigation, I'd suggest enabling retries. This documentation could be helpful for setting up a retry handling strategy.
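For illustration, a minimal retry wrapper around make_public() could look like the sketch below; the exception types, backoff values, and the helper name make_blob_public are illustrative choices, not official guidance.
# A minimal sketch: retry make_public() on transient 5xx errors using
# google.api_core.retry. Backoff values and the helper name are examples.
from google.api_core import exceptions, retry

is_transient = retry.if_exception_type(
    exceptions.ServiceUnavailable,    # 503
    exceptions.InternalServerError,   # 500
)

@retry.Retry(predicate=is_transient, initial=1.0, maximum=10.0, deadline=60.0)
def make_blob_public(blob):
    blob.make_public()

You would then call make_blob_public(blob) in place of blob.make_public().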

This one is a bit tricky. I ran into the same issue and found that the Python API client doesn't enable retries for the upload_from_string() method.
All upload_from_string() does is call upload_from_file(), which supports a num_retries argument, but upload_from_string() never passes it:
def upload_from_string(self,
                       data,
                       content_type="text/plain",
                       client=None,
                       predefined_acl=None):
    data = _to_bytes(data, encoding="utf-8")
    string_buffer = BytesIO(data)
    self.upload_from_file(
        file_obj=string_buffer,
        size=len(data),
        content_type=content_type,
        client=client,
        predefined_acl=predefined_acl,
    )
You can work around upload_from_string() by reusing the upload_from_file() implementation and adding retries:
from io import BytesIO

from google.cloud._helpers import _to_bytes
from google.cloud.storage import Blob

def upload_from_string(
    data, file_path, bucket, client, content_type, num_retries
):
    data = _to_bytes(data, encoding="utf-8")
    string_buffer = BytesIO(data)
    blob = Blob(file_path, bucket)
    blob.upload_from_file(
        file_obj=string_buffer,
        size=len(data),
        client=client,
        num_retries=num_retries,
        content_type=content_type
    )

To handle this error gracefully and wait as the 503 documentation suggests, note that these errors inherit from GoogleAPICallError and can therefore be inspected for the error code:
from google.api_core.exceptions import GoogleAPICallError

try:
    blob.upload_from_filename(YOUR_UPLOAD_PARAMETERS)
except GoogleAPICallError as e:
    if e.code == 503:
        print(f'GCP storage unavailable: {e}')
        ...  # handle the error gracefully, or simply ignore
    else:
        raise
Additionally, you may use retry.Retry, as suggested in the docs:
blob.upload_from_filename(YOUR_UPLOAD_PARAMETERS, retry=retry.Retry())
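For completeness, a hedged sketch of that call with the import included; newer releases of google-cloud-storage accept a retry argument on the upload methods, and the file name and content type below are just examples.
# Assumes a recent google-cloud-storage release where upload_from_filename
# accepts a retry argument; the file name and content type are placeholders.
from google.api_core import retry

blob.upload_from_filename(
    'report.pdf',
    content_type='application/pdf',
    retry=retry.Retry(deadline=60),
)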

Related

Django CSV file not downloading? Large CSV file downloads take some time: Django 502 Bad Gateway nginx error

How can I download a large CSV file that shows me a 502 bad gateway error?
I found a solution, which I've added below.
The idea is streaming: just as a movie download in the browser shows its progress and only completes at the end, the CSV file is streamed to the client piece by piece instead of being built in a single response.
One option for resolving this error is to increase the nginx timeout, but that has a cost, so the better approach is Django's streaming response.
Write a view for this in Django.
views.py
import datetime

from django.contrib import messages
from django.http import StreamingHttpResponse
from django.shortcuts import redirect
from django.views import generic

ERROR_503 = 'Something went wrong.'
DASHBOARD_URL = 'path'

def get_headers():
    return ['field1', 'field2', 'field3']

def get_data(item):
    return {
        'field1': item.field1,
        'field2': item.field2,
        'field3': item.field3,
    }

class CSVBuffer(object):
    def write(self, value):
        return value

class Streaming_CSV(generic.View):
    model = Model_name

    def get(self, request, *args, **kwargs):
        try:
            queryset = self.model.objects.filter(is_draft=False)
            # iter_items (defined in the gist referenced below) streams the queryset rows through CSVBuffer.
            response = StreamingHttpResponse(streaming_content=(iter_items(queryset, CSVBuffer())), content_type='text/csv')
            file_name = 'Experience_data_%s' % (str(datetime.datetime.now()))
            response['Content-Disposition'] = 'attachment;filename=%s.csv' % (file_name)
        except Exception as e:
            print(e)
            messages.error(request, ERROR_503)
            return redirect(DASHBOARD_URL)
        return response
urls.py
path('streaming-csv/', views.Streaming_CSV.as_view(), name='streaming-csv')
For reference, see the links below:
https://docs.djangoproject.com/en/4.0/howto/outputting-csv/#streaming-large-csv-files
Gist:
https://gist.github.com/niuware/ba19bbc0169039e89326e1599dba3a87
Related question: Adding rows manually to StreamingHttpResponse (Django)

google-ml-engine custom prediction routine error responses

I have a custom prediction routine in google-ml-engine. Works very well.
I am now doing input checking on the instance data, and want to return error responses from my predict routine.
The example: https://cloud.google.com/ai-platform/prediction/docs/custom-prediction-routines
raises exceptions on input errors, etc. However, when this happens the response body always contains {'error': 'Prediction failed: unknown error'}. I can see the correct errors being logged in the Google Cloud console, but the HTTPS response is always the same unknown error.
My question is:
How to make the Custom prediction routine return a proper error code and error message string?
Instead of returning a prediction, I can return an error string/code in the prediction itself, but it ends up in the predictions part of the response, which seems hacky and doesn't capture any of the Google-side errors (e.g. those based on instance size).
root:test_deployment.py:35 {'predictions': {'error': "('Instance does not include required sensors', 'occurred at index 0')"}}
What's the best way to do this?
Thanks!
David
Please take a look at the following code. I created a _validate function that is called inside predict, and I use a custom Exception class.
Basically, I validate the instances before calling the model's predict method, and handle the exception.
There may be some overhead to the response time when doing this validation, which you need to test for your use case.
requests = [
    "god this episode sucks",
    "meh, I kinda like it",
    "what were the writer thinking, omg!",
    "omg! what a twist, who would'v though :o!",
    99999
]

api = discovery.build('ml', 'v1')
parent = 'projects/{}/models/{}/versions/{}'.format(PROJECT, MODEL_NAME, VERSION_NAME)
parent = 'projects/{}/models/{}'.format(PROJECT, MODEL_NAME)
# request_data (the JSON body built from the instances above) is not shown in this snippet.
response = api.projects().predict(body=request_data, name=parent).execute()

# Response:
{'predictions': [{'Error code': 1, 'Message': 'Invalid instance type'}]}
Custom Prediction class:
import os
import pickle
import numpy as np
import logging
from datetime import date

import tensorflow.keras as keras

class CustomModelPredictionError(Exception):
    def __init__(self, code, message='Error found'):
        self.code = code
        self.message = message  # you could add more args

    def __str__(self):
        return str(self.message)

def isstr(s):
    return isinstance(s, str) or isinstance(s, bytes)

def _validate(instances):
    for instance in instances:
        if not isstr(instance):
            raise CustomModelPredictionError(1, 'Invalid instance type')
    return instances

class CustomModelPrediction(object):
    def __init__(self, model, processor):
        self._model = model
        self._processor = processor

    def _postprocess(self, predictions):
        labels = ['negative', 'positive']
        return [
            {
                "label": labels[int(np.round(prediction))],
                "score": float(np.round(prediction, 4))
            } for prediction in predictions]

    def predict(self, instances, **kwargs):
        try:
            instances = _validate(instances)
        except CustomModelPredictionError as c:
            return [{"Error code": c.code, "Message": c.message}]
        else:
            preprocessed_data = self._processor.transform(instances)
            predictions = self._model.predict(preprocessed_data)
            labels = self._postprocess(predictions)
            return labels

    @classmethod
    def from_path(cls, model_dir):
        model = keras.models.load_model(
            os.path.join(model_dir, 'keras_saved_model.h5'))
        with open(os.path.join(model_dir, 'processor_state.pkl'), 'rb') as f:
            processor = pickle.load(f)
        return cls(model, processor)
Complete code in this notebook.
If it is still relevant to you, I found a way by using Google internal libraries (not sure whether Google would endorse it, though).
The AI Platform custom prediction wrapper only returns a custom error message if the exception thrown is a specific one from their internal library.
It might also not be very reliable, since you have little control if Google decides to change it.
class Predictor(object):
    def predict(self, instances, **kwargs):
        # Your prediction code here

        # This is an internal google library, it should be available at prediction time.
        from google.cloud.ml.prediction import prediction_utils
        raise prediction_utils.PredictionError(0, "Custom error message goes here")

    @classmethod
    def from_path(cls, model_dir):
        # Your logic to load the model here
        pass
You would get the following message in your HTTP response
Prediction failed: Custom error message goes here

Are there any Django packages to create signed urls for Google Cloud Storage resources?

I'm writing a fairly simple photo app using django-rest-framework for the API and django-storages for the storage engine. The front end is being written in Vue.js. I have the uploading part working, and now I'm trying to serve up the photos. As now seems obvious, when the browser tries to load the images from GCS, I just get a bunch of 403 Forbidden errors. I did some reading up on this and it seems that the best practice in my case would be to generate signed URLs that expire after some amount of time. I haven't been able to find a package for this, which is what I was hoping for. Short of that, it's not clear to me precisely how to do this in Django.
This is working code in Django 1.11 with Python 3.5.
import os

from google.oauth2 import service_account
from google.cloud import storage

class CloudStorageURLSigner(object):
    @staticmethod
    def get_video_signed_url(bucket_name, file_path):
        creds = service_account.Credentials.from_service_account_file(
            os.environ.get('GOOGLE_APPLICATION_CREDENTIALS')
        )
        bucket = storage.Client().get_bucket(bucket_name)
        blob = bucket.blob(file_path)
        signed_url = blob.generate_signed_url(
            method='PUT',
            expiration=1545367030,  # epoch time
            content_type='audio/mpeg',  # change accordingly
            credentials=creds
        )
        return signed_url
Yes, take a look at google-cloud-storage.
Installation:
pip install google-cloud-storage
Also, make sure to refer to the API documentation as you need more features.
Hope it helps!
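For orientation, here is a minimal sketch of generating a time-limited signed URL with that library; the bucket name, object path, and 15-minute expiration are placeholder values.
# A minimal sketch with google-cloud-storage; bucket name, object path,
# and expiration are placeholders.
from datetime import timedelta
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('my-photo-bucket')   # hypothetical bucket
blob = bucket.blob('uploads/photo.jpg')     # hypothetical object path
url = blob.generate_signed_url(expiration=timedelta(minutes=15))
print(url)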
I ended up solving this problem by using to_representation in serializers.py:
import datetime

from google.cloud import storage
from google.cloud.storage import Blob

client = storage.Client()
bucket = client.get_bucket('myBucket')

def to_representation(self, value):
    try:
        blob = Blob(name=value.name, bucket=bucket)
        signed_url = blob.generate_signed_url(expiration=datetime.timedelta(minutes=5))
        return signed_url
    except ValueError as e:
        print(e)
        return value
Extending Evan Zamir's answer: instead of reassigning client and bucket, you can get them from Django's default_storage (this saves time since they are already available).
This goes in settings.py:
from datetime import timedelta
from google.oauth2 import service_account
GS_CREDENTIALS = service_account.Credentials.from_service_account_file('credentials.json')
DEFAULT_FILE_STORAGE = "storages.backends.gcloud.GoogleCloudStorage"
GS_BUCKET_NAME = "my-bucket"
GS_EXPIRATION = timedelta(seconds=60)
In serializers.py
from django.core.files.storage import default_storage
from google.cloud.storage import Blob
from rest_framework import serializers

class SignedURLField(serializers.FileField):
    def to_representation(self, value):
        try:
            blob = Blob(name=value.name, bucket=default_storage.bucket)
            signed_url = blob.generate_signed_url(expiration=default_storage.expiration)
            return signed_url
        except ValueError as e:
            print(e)
            return value
You can use this class in your serializer like this:
class MyModelSerializer(serializers.ModelSerializer):
    file = SignedURLField()
Note: do not set GS_DEFAULT_ACL = 'publicRead' if you want signed URLs, as it creates public URLs that do not expire.

Django Elasticsearch AWS httplib UnicodeDecodeError

I'm trying to setup Elasticsearch with Django (without Haystack).
Everything works perfectly locally.
But when I try to use the elasticsearch-py client with IAM-based authentication on AWS, I get this error:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 827, in _send_output
msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position132: ordinal not in range(128)
I tried to use aws-es-connection and requests-aws4auth but I get the same error.
It only works when I allow open access on AWS and use only elasticsearch-py like so
from elasticsearch import Elasticsearch, RequestsHttpConnection

ES_CLIENT = Elasticsearch(
    ['search-domain-xxx.us-east-1.es.amazonaws.com'],
    connection_class=RequestsHttpConnection
)
But I want something more secure ...
I think it is a utf-8/unicode/str problem but I can't manage to resolve it :(
I finally managed to fix the bug by using a custom serializer:
import json

from elasticsearch import Elasticsearch, RequestsHttpConnection, serializer, compat, exceptions

class JSONSerializerPython2(serializer.JSONSerializer):
    """Override elasticsearch library serializer to ensure it encodes utf characters during json dump.
    See original at: https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/serializer.py#L42
    A description of how ensure_ascii encodes unicode characters to ensure they can be sent across the wire
    as ascii can be found here: https://docs.python.org/2/library/json.html#basic-usage
    """
    def dumps(self, data):
        # don't serialize strings
        if isinstance(data, compat.string_types):
            return data
        try:
            return json.dumps(data, default=self.default, ensure_ascii=True)
        except (ValueError, TypeError) as e:
            raise exceptions.SerializationError(data, e)
and then pass the serializer to elasticsearch
from elasticsearch import Elasticsearch
es = Elasticsearch(..., serializer=JSONSerializerPython2())
I found the solution here
If someone can find the reason behind this error, I would be immensely grateful. Things I have tried with no luck:
encode things to 'utf-8' before writing (result: no errors thrown, es index not even created)
something like the code-snippet below (result: no errors thrown, es index not even created)
def convert_unicode_to_str(item):
    if isinstance(item, basestring):
        return str(item)
    .....
    .....
[Note: This is a bit of a hack, but it gets the job done]
So, sys.setdefaultencoding(..) is not available in the sys module's namespace. This is solved by calling the reload function on sys.
Now, prior to writing to ES, setting the default encoding to 'utf-8' solves this problem. Just to be sure that nothing else breaks as a result of this, I think it is a good idea to reset the default encoding to what it was before the switch.
import sys

reload(sys)

# keep track of what the old encoding was
old_encoding = sys.getdefaultencoding()

# set the default encoding to `utf-8`
sys.setdefaultencoding('utf-8')

#################
## write to es ##
#################

# reset the state of the world to what it was
sys.setdefaultencoding(old_encoding)
Extending Debosmit Ray's solution for django-haystack, add this to search_indexes.py:
import sys

def set_default_utf8(method):
    """
    Set utf-8 as default encoding, perform `method`,
    then set the old encoding back.
    """
    def wrapper(*args, **kwargs):
        reload(sys)
        old_encoding = sys.getdefaultencoding()
        sys.setdefaultencoding('utf-8')
        result = method(*args, **kwargs)
        sys.setdefaultencoding(old_encoding)
        return result
    return wrapper

class EntryIndex(SearchIndex):
    text = CharField(document=True, use_template=True)
    ...

    @set_default_utf8
    def update(self, using=None):
        """Overrides entire index update to use UTF-8."""
        super(EntryIndex, self).update(using)

    @set_default_utf8
    def update_object(self, instance, using=None, **kwargs):
        """Overrides index update for a single object using UTF-8."""
        super(EntryIndex, self).update_object(instance, using, **kwargs)

Deleting a video from YouTube with the YouTube Data API v3 and Python

I'm developing an application using Django and AngularJS.
One of the major things the worker server (written in Python with Flask) does is download videos from S3 (which were uploaded by users) and upload those videos to YouTube.
Is there a way to delete a YouTube video in Python?
There is no such code example written in Python.
Does anyone know how to do this simply, like the code example below?
This is sample code for uploading a video. I referred to this code when implementing the upload feature.
def get_authenticated_service(args):
    flow = flow_from_clientsecrets(CLIENT_SECRETS_FILE,
        scope=YOUTUBE_UPLOAD_SCOPE,
        message=MISSING_CLIENT_SECRETS_MESSAGE)

    storage = Storage("%s-oauth2.json" % sys.argv[0])
    credentials = storage.get()

    if credentials is None or credentials.invalid:
        credentials = run_flow(flow, storage, args)

    return build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION,
        http=credentials.authorize(httplib2.Http()))

def initialize_upload(youtube, options):
    tags = None
    if options.keywords:
        tags = options.keywords.split(",")

    body = dict(
        snippet=dict(
            title=options.title,
            description=options.description,
            tags=tags,
            categoryId=options.category
        ),
        status=dict(
            privacyStatus=options.privacyStatus
        )
    )

    # Call the API's videos.insert method to create and upload the video.
    insert_request = youtube.videos().insert(
        part=",".join(body.keys()),
        body=body,
        media_body=MediaFileUpload(options.file, chunksize=-1, resumable=True)
    )

    resumable_upload(insert_request)
Make a file called: delete_video.py
Usage: python delete_video.py --id=MY_VID_ID
#!/usr/bin/python

import httplib
import httplib2
import os
import random
import sys
import time

from apiclient.discovery import build
from apiclient.errors import HttpError
from apiclient.http import MediaFileUpload
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client.tools import argparser, run_flow

# Explicitly tell the underlying HTTP transport library not to retry, since
# we are handling retry logic ourselves.
httplib2.RETRIES = 1

# Maximum number of times to retry before giving up.
MAX_RETRIES = 10

# Always retry when these exceptions are raised.
RETRIABLE_EXCEPTIONS = (httplib2.HttpLib2Error, IOError, httplib.NotConnected,
    httplib.IncompleteRead, httplib.ImproperConnectionState,
    httplib.CannotSendRequest, httplib.CannotSendHeader,
    httplib.ResponseNotReady, httplib.BadStatusLine)

# Always retry when an apiclient.errors.HttpError with one of these status
# codes is raised.
RETRIABLE_STATUS_CODES = [500, 502, 503, 504]

# The CLIENT_SECRETS_FILE variable specifies the name of a file that contains
# the OAuth 2.0 information for this application, including its client_id and
# client_secret. You can acquire an OAuth 2.0 client ID and client secret from
# the Google Developers Console at
# https://console.developers.google.com/.
# Please ensure that you have enabled the YouTube Data API for your project.
# For more information about using OAuth2 to access the YouTube Data API, see:
# https://developers.google.com/youtube/v3/guides/authentication
# For more information about the client_secrets.json file format, see:
# https://developers.google.com/api-client-library/python/guide/aaa_client_secrets
CLIENT_SECRETS_FILE = "client_secrets.json"

# This OAuth 2.0 access scope allows full access to the authenticated user's
# YouTube account, which is required to delete videos.
YOUTUBE_DELETE_SCOPE = "https://www.googleapis.com/auth/youtube"
YOUTUBE_API_SERVICE_NAME = "youtube"
YOUTUBE_API_VERSION = "v3"

# This variable defines a message to display if the CLIENT_SECRETS_FILE is
# missing.
MISSING_CLIENT_SECRETS_MESSAGE = """
WARNING: Please configure OAuth 2.0

To make this sample run you will need to populate the client_secrets.json file
found at:

   %s

with information from the Developers Console
https://console.developers.google.com/

For more information about the client_secrets.json file format, please visit:
https://developers.google.com/api-client-library/python/guide/aaa_client_secrets
""" % os.path.abspath(os.path.join(os.path.dirname(__file__),
    CLIENT_SECRETS_FILE))

VALID_PRIVACY_STATUSES = ("public", "private", "unlisted")

def get_authenticated_service(args):
    flow = flow_from_clientsecrets(CLIENT_SECRETS_FILE,
        scope=YOUTUBE_DELETE_SCOPE,
        message=MISSING_CLIENT_SECRETS_MESSAGE)

    storage = Storage("%s-oauth2.json" % sys.argv[0])
    credentials = storage.get()

    if credentials is None or credentials.invalid:
        credentials = run_flow(flow, storage, args)

    return build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION,
        http=credentials.authorize(httplib2.Http()))

if __name__ == '__main__':
    argparser.add_argument("--id", required=True, help="Video youtube ID")
    args = argparser.parse_args()

    if not args.id:
        exit("Please specify a youtube ID using the --id= parameter.")

    youtube = get_authenticated_service(args)
    try:
        resp = youtube.videos().delete(id=args.id, onBehalfOfContentOwner=None).execute()
    except HttpError, e:
        print "An HTTP error %d occurred:\n%s" % (e.resp.status, e.content)
Assuming that you are using the Python client library, I found this in the documentation:
delete(id=*, onBehalfOfContentOwner=None)
Deletes a YouTube video.

Args:
  id: string, The id parameter specifies the YouTube video ID for the resource
    that is being deleted. In a video resource, the id property specifies the
    video's ID. (required)
  onBehalfOfContentOwner: string, Note: This parameter is intended exclusively
    for YouTube content partners.