SSL validation failing when calling AWS API command - amazon-web-services

When running the following from my laptop:
import sys, os, boto3, json
payload = json.dumps({'query':query})
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('table1')
response = table.get_item(Key={'item': 'item1'})
I get the error:
botocore.exceptions.SSLError: SSL validation failed for https://dynamodb.eu-west-2.amazonaws.com/ [Errno 2] No such file or directory
It crashes on the last line, at the table.get_item(...) call. This error appeared today for the first time. Before that, the same code had been running fine on the same laptop for the past two years. The same code still runs fine in AWS Lambda. The code is in Python 2.7.
I have been trying to resolve this issue for the past six hours. I have reinstalled boto3, botocore and awscli, reconfigured the AWS CLI, and updated pip and all the modules I use in Python.
Any help would be appreciated. Thank you.
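The "[Errno 2] No such file or directory" part of the message suggests that a file lookup fails during the TLS setup, and one possible candidate is a CA bundle path that no longer exists. The following is only a minimal diagnostic sketch under that assumption (the question does not confirm the cause). It checks a few environment variables that are commonly used to point TLS clients at a CA bundle; the helper name find_missing_ca_bundles is made up for illustration. A ca_bundle entry in ~/.aws/config would be worth checking by hand as well.

```python
import os

# Environment variables that commonly point TLS clients at a CA bundle file.
# Assumption: one of these might be set on the affected laptop; the question
# does not say so.
CA_BUNDLE_VARS = ("AWS_CA_BUNDLE", "REQUESTS_CA_BUNDLE", "SSL_CERT_FILE")


def find_missing_ca_bundles(env=None):
    """Return {variable: path} for CA bundle variables whose file does not exist."""
    env = os.environ if env is None else env
    missing = {}
    for name in CA_BUNDLE_VARS:
        path = env.get(name)
        if path and not os.path.isfile(path):
            missing[name] = path
    return missing


if __name__ == "__main__":
    for name, path in find_missing_ca_bundles().items():
        print("{0} points at a missing file: {1}".format(name, path))
```

If the script prints nothing, none of these variables points at a missing file and the cause lies elsewhere.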

Related

Unable to connect to Huggingface from EC2 instance

I am running Python code on an EC2 instance, where I load a Huggingface model using the from_pretrained() method. I get the error
OSError: Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json' to download pretrained model configuration file.
while trying to initialize the reader. To get over this, I downloaded the file manually and provided the local JSON path. That worked fine but then I see issues in loading the tokenizer too.
OSError: Couldn't reach server at '{}' to download vocabulary files.
I think the network settings of my EC2 instance are not correct, which is why I am unable to connect to the external Huggingface repository.
I tried relaxing the inbound rules for the EC2 instance to (IP version | Type | Protocol | Port range | Destination): IPv4 | All traffic | All | All | 0.0.0.0/0, but even that doesn't help. The outbound rules are already IPv4 | All traffic | All | All | 0.0.0.0/0.
I also tried creating an IAM role with the policy AmazonS3ReadOnlyAccess and attaching it to the EC2 instance, but I still get the same error.
Could someone point out what needs to be done to solve this? Thanks.
Here is how I fixed this issue.
I installed pyopenssl like this:
!pip install pyopenssl
Then I restarted the terminal and re-ran the code, and that fixed the issue for me. Thanks.
Your network might be using a proxy.
This might help:
# Placeholder hosts and ports; add your own proxy for each scheme
proxies = {"http": "foo.bar:3128", "https": "foo.bar:4012"}
from transformers import pipeline
qt_ans = pipeline('question-answering')
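Before changing security groups or IAM roles, it may help to check whether the instance can reach the download host at all, with and without a proxy. This is a minimal sketch using only the standard library. The function name can_reach and its opener parameter are hypothetical, and the proxy address in the docstring is a placeholder.

```python
import urllib.error
import urllib.request


def can_reach(url, proxies=None, timeout=5, opener=None):
    """Return True if a request to url completes, False on a network-level failure.

    proxies is a mapping such as {"https": "http://foo.bar:3128"} (placeholder host).
    opener is a hypothetical hook so the function can be exercised offline.
    """
    if opener is None:
        # ProxyHandler(None) falls back to the proxy settings in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered (for example with 403 or 404), so the network path works.
        return True
    except (urllib.error.URLError, OSError):
        return False
```

Calling can_reach() on the URL from the error message, once without proxies and once with your proxy mapping, should show whether the failure is a routing or proxy problem rather than a permissions problem.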

AWS EMR pyspark notebook fails with `Failed to run command /usr/bin/virtualenv (...)`

I have created a basic EMR cluster in AWS, and I'm trying to use the Jupyter Notebooks provided through the AWS Console. Launching the notebooks seems to work fine, and I'm also able to run basic python code in notebooks started with the pyspark kernel. Two variables are set up in the notebook: spark is a SparkSession instance, and sc is a SparkContext instance. Displaying sc yields <SparkContext master=yarn appName=livy-session-0> (the output can of course vary slightly depending on the session).
The problem arises once I perform operations that actually hit the spark machinery. For example:
sc.parallelize(list(range(10))).map(lambda x: x**2).collect()
I am no spark expert, but I believe this code should distribute the integers from 0 to 9 across the cluster, square them, and return the results in a list. Instead, I get a lengthy stack trace, mostly from the JVM, but also some python components. I believe the central part of the stack trace is the following:
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in stage 4.0 (TID 116, ip-XXXXXXXXXXXXX.eu-west-1.compute.internal, executor 17): java.lang.RuntimeException: Failed to run command: /usr/bin/virtualenv -p python3 --system-site-packages virtualenv_application_1586243436143_0002_0
The full stack trace is here.
A bit of digging in the AWS portal led me to log output from the nodes. stdout from one of the nodes includes the following:
The path python3 (from --python=python3) does not exist
I tried running the /usr/bin/virtualenv command on the master node manually (after logging in through SSH), and that worked fine, but the error is of course still present after I did that.
While this error occurs most of the time, I was able to get this working in one session, where I could run several operations against the spark cluster as I was expecting.
Technical information on the cluster setup:
emr-6.0.0
Applications installed are "Ganglia 3.7.2, Spark 2.4.4, Zeppelin 0.9.0, Livy 0.6.0, JupyterHub 1.0.0, Hive 3.1.2". Hadoop is also included.
3 nodes (one of them as master), all r5a.2xlarge.
Any ideas what I'm doing wrong? Note that I am completely new to EMR and Spark.
Edit: Added the stdout log and information about running the virtualenv command manually on the master node through ssh.
I have switched to using emr-5.29.0, which seems to resolve the problem. Perhaps this is an issue with emr-6.0.0? In any case, I have a functional workaround.
The issue for me was that the virtualenv was being made on the executors with a python path that didn't exist. Pointing the executors to the right one did the job for me:
"spark.pyspark.python": "/usr/bin/python3.7"
Here is how I reconfigured the Spark app at the beginning of the notebook:
{
  "conf": {
    "spark.pyspark.python": "/usr/bin/python3.7",
    "spark.pyspark.virtualenv.enabled": "true",
    "spark.pyspark.virtualenv.type": "native",
    "spark.pyspark.virtualenv.bin.path": "/usr/bin/virtualenv"
  }
}
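The node log quoted above ("The path python3 (from --python=python3) does not exist") indicates that the name handed to virtualenv did not resolve on the executor. A small sketch for checking that on a given node follows. The helper name resolve_interpreter is hypothetical, and the script has to be run on the node you want to inspect, since the master and the executors may differ.

```python
import shutil
import sys


def resolve_interpreter(name):
    """Return the path that `name` resolves to on PATH, or None if it is not found."""
    return shutil.which(name)


if __name__ == "__main__":
    # "python3" is the value virtualenv received via -p in the failing command;
    # "python3.7" matches the path used in the configuration above.
    for candidate in ("python3", "python3.7"):
        print(candidate, "->", resolve_interpreter(candidate))
    print("current interpreter:", sys.executable)
```

A result of None for python3 on an executor node would be consistent with the error, and the printed path for python3.7 is a candidate value for spark.pyspark.python.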

appcfg.py request_logs certificate verify failed (_ssl.c:661)

We've been using appcfg.py request_logs to download GAE logs. Every once in a while it throws the error:
httplib2.SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)
After a few tries it works, though, and sometimes it also works after updating gcloud with gcloud components update. We thought it might be a network throttling issue of some kind and didn't give it much thought. Lately, though, we have been trying to figure out what is causing this.
The full command we use is:
appcfg.py request_logs -A testapp --version=20180321t073239 --severity=0 all_logs.log --append --no_cookies
It seems the error is related to the httplib2 library, but since it is part of the appcfg.py calls, we're not sure we should tamper with anything inside it.
Versions:
Python 2.7.13
Google Cloud SDK 196.0.0
app-engine-python 1.9.67
This has become more persistent: for the past few days I haven't been able to download logs no matter how many times I try.
Looking at the download logs command I tried the same command again but without the --no_cookies flag to see what would happen.
appcfg.py request_logs -A testapp --version=20180321t073239 --severity=0 all_logs.log --append
I got the error:
Error 403: --- begin server output ---
You do not have permission to modify this app (app_id=u'e~testapp').
--- end server output ---
Which led me to the answer provided here https://stackoverflow.com/a/34694577/1394228 by @ninjahoahong. This worked for me, and the logs were downloaded on the first try, in case someone faces the same issue.
There's also this Google Group post which I didn't try but seems like it does the same thing.
I'm not sure whether removing the file ~/.appcfg_oauth2_tokens has other effects; I have yet to find out.
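Since the side effects of removing ~/.appcfg_oauth2_tokens are unknown, one cautious option is to move the file aside instead of deleting it, so it can be restored later. This is a minimal sketch of that idea, not part of appcfg.py. The function name backup_token_file and the ".bak" suffix are made up for illustration.

```python
import os
import shutil


def backup_token_file(path=None):
    """Move the cached token file aside; return the backup path, or None if absent."""
    if path is None:
        path = os.path.expanduser("~/.appcfg_oauth2_tokens")
    if not os.path.exists(path):
        return None
    backup = path + ".bak"  # hypothetical backup name
    shutil.move(path, backup)
    return backup


if __name__ == "__main__":
    print("moved to:", backup_token_file())
```

Moving the ".bak" file back to its original name restores the previous state if something else turns out to depend on it.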
Update:
I also found out that my httplib2, located at /Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/httplib2, was at version = "0.7.5". I upgraded it to version = '0.11.3' by running pip's upgrade command with a target directory:
sudo pip2 install --upgrade httplib2 -t /Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/httplib2/

Deployment of Django app to AWS Lambda using zappa fails even though Zappa says your app is live at the following link

I came across the amazing serverless AWS Lambda recently and thought it would be great to have my app up there and not have to worry about auto scaling, load balancing and all for apparently a fraction of the cost.
So then I found out about Zappa which takes care of deploying your python app to AWS Lambda for you. Amazing is what I thought.
It's actually, on paper, very easy to do. Just follow the instructions here:
https://github.com/Miserlou/Zappa
Anyway, I followed the instructions with a very basic Django app in a virtualenv that contained just the Django REST framework tutorial.
I tested it locally and it works fine.
Then I set up my s3 bucket and authenticated my credentials with the awscli.
Then I ran the two commands you need to deploy:
zappa init
zappa deploy dev
It went through all its processes: packaging into a zip, deploying, etc.
At the end it said the app was live and gave me a URL to try.
Oh yeah, and my S3 bucket is still empty, and so is my AWS Lambda service.
I pasted the URL into the browser and this is what the browser displayed:
{
"message": "An uncaught exception happened while servicing this request.",
"traceback": [
"Traceback (most recent call last):\n",
" File \"/var/task/handler.py\", line 395, in handler\n response = Response.from_app(self.wsgi_app, environ)\n",
" File \"/home/donagh/projects/vizzydev/vizzy/visualid/vizzy_django/env/build/Werkzeug/werkzeug/wrappers.py\", line 865, in from_app\n",
" File \"/home/donagh/projects/vizzydev/vizzy/visualid/vizzy_django/env/build/Werkzeug/werkzeug/wrappers.py\", line 57, in _run_wsgi_app\n",
" File \"/home/donagh/projects/vizzydev/vizzy/visualid/vizzy_django/env/build/Werkzeug/werkzeug/test.py\", line 871, in run_wsgi_app\n",
"TypeError: 'NoneType' object is not callable\n"
]
}
If anyone has any ideas I would be very grateful. I would love to get this working. It would be an incredibly powerful resource.
When I get errors related to the Werkzeug wrappers, it is usually because my packages were not installed in my virtual environment.
virtualenv venv
source venv/bin/activate
pip install Django
pip install zappa
# pip install any other packages
# or with a requirements.txt file
pip install -r requirements.txt
Then run the zappa deploy commands.
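Before deploying, a quick check that the virtual environment is active and that the expected packages are importable from it can catch this kind of problem early. This is a minimal sketch. The helper names in_virtualenv and missing_packages are hypothetical, and the module names in the example are placeholders for whatever your project actually imports.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a virtual environment."""
    # venv and recent virtualenv set base_prefix; older virtualenv sets real_prefix.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base != sys.prefix


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    # Module names are examples; list whatever your project imports.
    print("missing:", missing_packages(["django", "zappa", "werkzeug"]))
```

Run it with the interpreter from the activated environment. An empty "missing" list together with an active virtualenv suggests the packaging step at least has the right packages to pick up.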

Error: "The security token included in the request is invalid" when using Boto python

I am getting this error when I run my Django project on nginx. I use DynamoDB for the database and S3 for serving static files in the project. The project runs fine on localhost.
The project was originally hosted on another EC2 instance, where it ran like a charm. I fired up a new EC2 instance from an image of that instance, and now it is throwing this error.
The thing is, the connection works fine when I run some test code on the command line, but it throws this error when the project runs.
JSONResponseError at /
JSONResponseError: 400 Bad Request
{u'message': u'The security token included in the request is invalid.', u'__type': u'com.amazon.coral.service#UnrecognizedClientException'}
Request Method: POST
Request URL: http://ec2-54-200-144-115.us-west-2.compute.amazonaws.com/?attempt=1&code=AQBfOzPR4Hlgrpkjz-qXQj8b7OLq6cm1NM_oZf64Wz3EmlX2-VDS6qfZ5V5f0Tmbx4MrLc4SGuJxUHa8drQClz3A1IWMVqUGKLEEW_0ol1RqClI8cZViWreBm5c3HJ-Vp48Xx81a7gvXSjRNJUn-kazXqahDrgsAeLez_8FrXIb_HWHyekhnUmxgkskRGBNzcTtpqASNe3agzG3ZZowCMYi6bDBAdVuODli3ApWQWENSmjLaN5QbZWbGo3ATvJNMAUQjj6VTHCkVS-UWcuh-PtwAAFtUqb8HkLsbFG31KevwPKz6x10ojD45pe03zA1SF_g
Django Version: 1.5
Exception Type: JSONResponseError
Exception Value:
JSONResponseError: 400 Bad Request
{u'message': u'The security token included in the request is invalid.', u'__type': u'com.amazon.coral.service#UnrecognizedClientException'}
Exception Location: /srv/www/test/local/lib/python2.7/site-packages/boto/dynamodb2/layer1.py in _retry_handler, line 1530
Python Executable: /srv/www/test/bin/python
Python Version: 2.7.3
Can't understand what's going on. Can anyone help me?
You are getting this message because the security credentials / access keys for your AWS account have been changed.
Try again with the new access keys and it will work.
All the best.
Are you using a newer version of Python 2.7? I had a similar error because of the SSL fixes.
I did this and it worked:
import ssl
# workaround for Python on the Mac, which is newer than the one on Linux
ssl._create_default_https_context = ssl._create_unverified_context
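It is worth being clear about what this override does: it swaps the default HTTPS context factory for one that skips certificate verification for the whole process. The following small sketch, using only the standard library, compares the two kinds of context. The variable names are illustrative.

```python
import ssl

# Context as Python creates it by default: certificates and hostnames are checked.
verified = ssl.create_default_context()

# Context produced by the private helper used in the override above.
unverified = ssl._create_unverified_context()

print("default    :", verified.verify_mode, "check_hostname =", verified.check_hostname)
print("unverified :", unverified.verify_mode, "check_hostname =", unverified.check_hostname)
```

If the underlying problem is a missing or outdated CA bundle, pointing ssl.create_default_context(cafile=...) at a valid bundle may be an alternative that keeps verification on.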
None of the solutions here worked for me, so here is what did.
On a MEAN stack, I had to run as sudo:
sudo forever start startme.js
or
sudo node startme.js
When I did NOT use 'sudo', I received the error:
UnrecognizedClientException: The security token included in the request is invalid.
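Both observations in this thread (the code works on the command line but not under nginx, and it works with sudo but not without) would be consistent with different users or processes picking up different credentials. That is an assumption, not something the thread confirms. The following minimal sketch lists which common credential sources are visible to the current process without printing any secrets. The function name credential_sources is hypothetical, and the list of sources is not exhaustive (instance roles, for example, are not covered).

```python
import os

CREDENTIAL_ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")
# Per-user files; paths are relative to the home directory of the running user.
CREDENTIAL_FILES = (".boto", os.path.join(".aws", "credentials"))


def credential_sources(env=None, home=None):
    """List credential sources visible to this process, without printing secrets."""
    env = os.environ if env is None else env
    home = os.path.expanduser("~") if home is None else home
    found = ["env:" + name for name in CREDENTIAL_ENV_VARS if env.get(name)]
    for rel in CREDENTIAL_FILES:
        path = os.path.join(home, rel)
        if os.path.isfile(path):
            found.append("file:" + path)
    return found


if __name__ == "__main__":
    for source in credential_sources():
        print(source)
```

Running it once as your own user and once as the user that runs the web server (or under sudo) shows whether the two see the same sources.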