django-cumulus: retrieve PIL.Image object from django.db.models.ImageField - django

I'm using django-cumulus to store my media on Rackspace cloud.
I need to retrieve data from ImageField to PIL.Image. I need it to make some changes on this image (cropping, filters, etc.) and save it to another cumulus ImageField.
I tried this code:
def field_to_image(field):
    # field - cumulus-powered ImageField on some model
    from StringIO import StringIO
    from PIL import Image
    r = field.read()  # the error is raised here
    image = Image.open(StringIO(r))
    return image
It worked well for half of my files, but for the other half I always get this error:
Traceback (most recent call last):
File "tmp.py", line 78, in <module>
resize_photos(start)
File "tmp.py", line 59, in resize_photos
photo.make_thumbs()
File "/hosting/site/news/models.py", line 65, in make_thumbs
i = functions.field_to_image(self.img)
File "/hosting/site/functions.py", line 169, in field_to_image
r = field.read()
File "/usr/local/lib/python2.7/dist-packages/cumulus/storage.py", line 352, in read
if self._pos == self._get_size() or chunk_size == 0:
File "/usr/local/lib/python2.7/dist-packages/cumulus/storage.py", line 322, in _get_size
self._size = self._storage.size(self.name)
File "/usr/local/lib/python2.7/dist-packages/cumulus/storage.py", line 244, in size
return self._get_object(name).total_bytes
AttributeError: 'bool' object has no attribute 'total_bytes'
Can anyone help me? Maybe there is a better way to retrieve a PIL.Image object from Rackspace?
The file I'm trying to read() exists and is available via its URL on Rackspace.

_get_object returns False when the file is not found in the container, which is why the error is so confusing: the code then tries to read total_bytes from a bool.
This is fixed in the repository, but not yet in the released version; there it returns None instead of False:
https://github.com/django-cumulus/django-cumulus/blob/master/cumulus/storage.py#L203
Either way, the root cause of the problem is the same: the file is not found.
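A minimal sketch of guarding against the missing-file case before reading. It assumes the field is a Django FieldFile (exposing name, storage, and read()) and that the storage backend implements Django's standard Storage.exists(); the helper name read_field_bytes is hypothetical and is not part of django-cumulus.

```python
from io import BytesIO


def read_field_bytes(field):
    """Return the stored file as a BytesIO, or raise IOError if it is missing.

    Assumption: `field` behaves like a Django FieldFile, i.e. it has
    `.name`, `.storage` (with the standard `exists(name)`), and `.read()`.
    """
    # Check for the object first, so a missing file gives a clear error
    # instead of an AttributeError deep inside the storage backend.
    if not field.name or not field.storage.exists(field.name):
        raise IOError("file not found in storage: %r" % (field.name,))
    return BytesIO(field.read())
```

The returned buffer is a file-like object, so it can be handed to PIL's Image.open() in place of the StringIO wrapper used in the question.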

Related

Invoke sagemaker endpoint with custom inference script

I've deployed a sagemaker endpoint using the following code:
from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role, Session
sess = Session()
role = get_execution_role()
model = PyTorchModel(model_data=my_trained_model_location,
                     role=role,
                     sagemaker_session=sess,
                     framework_version='1.5.0',
                     entry_point='inference.py',
                     source_dir='.')
predictor = model.deploy(initial_instance_count=1,
                         instance_type='ml.m4.xlarge',
                         endpoint_name='my_endpoint')
If I run:
import numpy as np
pseudo_data = [np.random.randn(1, 300), np.random.randn(6, 300), np.random.randn(3, 300), np.random.randn(7, 300), np.random.randn(5, 300)] # input data is a list of 2D numpy arrays with variable first dimension and fixed second dimension
result = predictor.predict(pseudo_data)
I can generate the result with no errors. However, if I want to invoke the endpoint and make prediction by running:
from sagemaker.predictor import RealTimePredictor
predictor = RealTimePredictor(endpoint='my_endpoint')
result = predictor.predict(pseudo_data)
I'd get an error:
Traceback (most recent call last):
File "default_local.py", line 77, in <module>
score = predictor.predict(input_data)
File "/home/biggytruck/.local/lib/python3.6/site-packages/sagemaker/predictor.py", line 113, in predict
response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/client.py", line 316, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/client.py", line 608, in _make_api_call
api_params, operation_model, context=request_context)
File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/client.py", line 656, in _convert_to_request_dict
api_params, operation_model)
File "/home/biggytruck/.local/lib/python3.6/site-packages/botocore/validate.py", line 297, in serialize_to_request
raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter Body
From my understanding, the error occurs because I didn't pass in inference.py as the entry point file, which is required to handle the input since it's not in a standard format supported by Sagemaker. However, sagemaker.predictor.RealTimePredictor doesn't allow me to define the entry point file. How can I solve this?
The error you're seeing is raised by the client-side SageMaker Python SDK library, not by the remote endpoint that you have published.
Here is the documentation for the data argument (in your case, this is pseudo_data):
data (object) – Input data for which you want the model to provide inference. If a serializer was specified when creating the RealTimePredictor, the result of the serializer is sent as input data. Otherwise the data must be sequence of bytes, and the predict method then sends the bytes in the request body as is.
Source: https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html#sagemaker.predictor.RealTimePredictor.predict
My guess is that pseudo_data is not the type that the SageMaker Python SDK is expecting, which is a sequence of bytes.
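One way to turn such a list into a sequence of bytes is sketched below. It is only an illustration: the helper name to_json_bytes is hypothetical, and the sketch assumes that the endpoint's inference.py can parse a JSON body, which the question does not confirm.

```python
import json


def to_json_bytes(arrays):
    """Encode a list of 2D arrays as UTF-8 JSON bytes.

    Accepts NumPy arrays (anything with a .tolist() method) or plain
    nested lists, so arrays with different first dimensions are fine.
    """
    as_lists = [a.tolist() if hasattr(a, "tolist") else a for a in arrays]
    return json.dumps(as_lists).encode("utf-8")
```

Calling predictor.predict(to_json_bytes(pseudo_data)) would then pass a bytes object as the request body, which matches the documented requirement when no serializer is set. Whether the model accepts that payload depends on how inference.py decodes its input.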

PyMongo 3 and ServerSelectionTimeoutError while getting data from Mongodb

This seems like an old, solved problem (here and here and here), but I am still getting this error. I created my database in Docker, and it worked only once. Before posting, I re-created the database, set connect=False, added a wait, downgraded pymongo, tried the earlier solutions, and so on. I am stuck.
Python 3.8.0, Pymongo 3.9.0
from pymongo import MongoClient
from pprint import pprint
client = MongoClient('mongodb://192.168.1.100:27017/',
                     username='admin',
                     password='psw',
                     authSource='myappdb',
                     authMechanism='SCRAM-SHA-1',
                     connect=False)
db = client['myappdb']
serverStatusResult = db.command("serverStatus")
pprint(serverStatusResult)
and I am getting ServerSelectionTimeoutError
Traceback (most recent call last):
File "C:\Users\ME\eclipse2019-workspace\exdjango\exdjango__init__.py", line 12, in <module>
serverStatusResult=db.command("serverStatus")
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\database.py", line 610, in command
with self.client._socket_for_reads(
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\contextlib.py", line 113, in __enter__
return next(self.gen)
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\mongo_client.py", line 1099, in _socket_for_reads
server = topology.select_server(read_preference)
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\topology.py", line 222, in select_server
return random.choice(self.select_servers(selector,
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\topology.py", line 182, in select_servers
server_descriptions = self._select_servers_loop(
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\topology.py", line 198, in _select_servers_loop
raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: 192.168.1.100:27017: timed out
Your connection looks a little misconfigured. Firstly you have half connection string, half parameter format. I'd suggest you stick with one or the other.
Your auth database is usually separate from your actual databases (and it's usually called admin). Check that this is correct.
There's no particular need to specify the authMechanism assuming you are using MongoDB 3.0 or later.
The connect=False is likely a red herring.
So I would try one of either:
client = MongoClient('mongodb://admin:psw@192.168.1.100:27017/myappdb?authSource=admin')
or
client = MongoClient(host='192.168.1.100',
                     port=27017,
                     username='admin',
                     password='psw',
                     authSource='admin')
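If you go with the connection-string form, the credentials must be percent-escaped whenever they contain reserved characters such as @ or /. The following is a small sketch of building the URI with the standard library; the helper name build_mongo_uri is hypothetical, and the values are the placeholders from the question.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host, port, db, auth_source="admin"):
    """Build a mongodb:// URI with percent-escaped credentials."""
    return "mongodb://%s:%s@%s:%d/%s?authSource=%s" % (
        quote_plus(user),      # escape reserved characters in the user name
        quote_plus(password),  # ...and in the password
        host,
        port,
        db,
        auth_source,
    )


uri = build_mongo_uri("admin", "psw", "192.168.1.100", 27017, "myappdb")
```

The resulting string can be passed directly to MongoClient(uri). Note that this only addresses how the connection is configured; a timeout can still occur if the server is not reachable from the client machine.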

Google Vision Python 2.7 TypeError: construct_settings() got an unexpected keyword argument 'metrics_headers'

After installing the required packages using pip, downloading a JSON key, and setting the environment variable in the cmd window with: set GOOGLE_APPLICATION_CREDENTIALS = 'C:\Users\ xxx .json', I followed the instructions for using the Google Vision API at https://googlecloudplatform.github.io/google-cloud-python/stable/vision-usage.html#authentication-and-configuration
I tried the following and got the error below. I have no idea how to solve it, so all suggestions are much appreciated.
>>> from google.cloud import vision
>>> client =vision.Client()
>>> print client
<google.cloud.vision.client.Client object at 0x08D414F0>
>>> image = client.image(filename='test2.jpg')
>>> print image
<google.cloud.vision.image.Image object at 0x0CBF68F0>
>>> text = image.detect_text()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\google\cloud\vision\image.py", line 289, in detect_text
annotations = self.detect(features)
File "C:\Python27\lib\site-packages\google\cloud\vision\image.py", line 143, in detect
return self._detect_annotation(images)
File "C:\Python27\lib\site-packages\google\cloud\vision\image.py", line 117, in _detect_annotation
return self.client._vision_api.annotate(images)
File "C:\Python27\lib\site-packages\google\cloud\vision\client.py", line 114, in _vision_api
self._vision_api_internal = _GAPICVisionAPI(self)
File "C:\Python27\lib\site-packages\google\cloud\vision\_gax.py", line 34, in __init__
lib_version=__version__)
File "C:\Python27\lib\site-packages\google\cloud\gapic\vision\v1\image_annotator_client.py", line 140, in __init__
metrics_headers=metrics_headers, )
TypeError: construct_settings() got an unexpected keyword argument 'metrics_headers'

UnicodeDecodeError: in python 2.7

I am working with the VirusTotal api, exactly with this:
https://www.virustotal.com/es/documentation/public-api/#scanning-files
This is the part of my script where I'm having problems:
def scanAFile(fileToScan):
    host = "www.virustotal.com"
    selector = "https://www.virustotal.com/vtapi/v2/file/scan"
    fields = [("apikey", myPublicKey)]
    file_to_send = open(fileToScan, "rb").read()
    files = [("file", fileToScan, file_to_send)]
    json = postfile.post_multipart(host, selector, fields, files)
    return simplejson.loads(json)
With some files there is no error and it runs fine, but scanning other files fails. For example, this is the error for a JPG file:
Traceback (most recent call last):
File "F:/devPy/myProjects/script_vt.py", line 138, in <module>
scanMyFile()
File "F:/devPy/myProjects/script_vt.py", line 75, in scanQueue
jsonScan = scanAFile(fileToScan)
File "F:/devPy/myProjects/script_vt.py", line 37, in scanAFile
json = postfile.post_multipart(host, selector, fields, files)
File "F:\devPy\myProjects\script_vt.py", line 10, in post_multipart
content_type, body = encode_multipart_formdata(fields, files)
File "F:\devPy\myProjects\script_vt.py", line 42, in encode_multipart_formdata
body = CRLF.join(L)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
I should mention that I work with PyCharm under Windows. Could this cause the encoding error?
Any idea how to solve it? I got stuck and couldn't find any solution on the net.

Solr issues with searching

I have been using Apache Solr for quite some time and only recently started running into severe issues with it. I'm using it with Haystack in a Django project. When I run a query from the manage.py shell, I get the following:
>>> from haystack.query import SearchQuerySet
>>> emps = SearchQuerySet().filter(django_ct='web.employer').filter(name__icontains='Mi')[:10]
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/haystack/query.py", line 241, in __getitem__
self._fill_cache(start, bound)
File "/usr/local/lib/python2.7/dist-packages/haystack/query.py", line 140, in _fill_cache
results = self.query.get_results(**kwargs)
File "/usr/local/lib/python2.7/dist-packages/haystack/backends/__init__.py", line 469, in get_results
self.run(**kwargs)
File "/usr/local/lib/python2.7/dist-packages/haystack/backends/solr_backend.py", line 501, in run
results = self.backend.search(final_query, **search_kwargs)
File "/usr/local/lib/python2.7/dist-packages/haystack/backends/__init__.py", line 47, in wrapper
return func(obj, query_string, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/haystack/backends/solr_backend.py", line 202, in search
raw_results = self.conn.search(query_string, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 578, in search
response = self._select(params)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 308, in _select
return self._send_request('get', path)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 293, in _send_request
error_message = self._extract_error(resp)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 372, in _extract_error
reason, full_html = self._scrape_response(resp.headers, resp.content)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 404, in _scrape_response
p_nodes = body_node.cssselect('p')
AttributeError: 'NoneType' object has no attribute 'cssselect'
I tried reinstalling haystack, lxml, cssselect, and pysolr, and I'm still getting these errors. Is there anything else I can try? Thanks for any help!
I also tried reading a few other SO questions, including this one:
XML error object has no attribute 'cssselect'
Seems like the issue is with pysolr. You might find some help here.
The same issue persisted for me even after upgrading pysolr and lxml to the latest versions.
It turned out that I was not using the Haystack-generated schema, which has a few additional fields compared to the default Solr one.
You can confirm if this is the case by looking at your solr logs.
It is an issue with pysolr, and it has not been fixed as of 3.3.0.
The only alternative is to override the pysolr code and handle the case where Solr returns a response with status != 200.
You can check whether the parsed response has a body element and adjust accordingly.
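The idea of checking for a body before selecting from it can be sketched as follows. This is a standalone Python 3 illustration using only the standard library; it is not pysolr's code, and the helper name extract_error_reason is hypothetical.

```python
from html.parser import HTMLParser


class _BodyParagraphs(HTMLParser):
    """Collects the text of <p> elements that appear after a <body> tag."""

    def __init__(self):
        super().__init__()
        self.saw_body = False
        self._in_p = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.saw_body = True
        elif tag == "p" and self.saw_body:
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self.paragraphs[-1] += data


def extract_error_reason(status, content):
    """Return a readable reason for a non-200 response, or None for 200."""
    if status == 200:
        return None
    parser = _BodyParagraphs()
    parser.feed(content)
    texts = [p.strip() for p in parser.paragraphs if p.strip()]
    if not parser.saw_body or not texts:
        # No usable <body>: fall back to the raw payload instead of failing.
        return content.strip() or "HTTP %d" % status
    return " ".join(texts)
```

With a guard like this, a non-HTML error payload is reported as-is instead of triggering an AttributeError on a missing body node.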