I have a very strange issue with uploading to S3 from Boto. In our (Elastic-Beanstalk-)deployed instances, we have no problems uploading to S3, and other developers with the same S3 credentials also have no issues. However, when locally testing, using the same Dockerfile, I can upload files up to exactly 1391 bytes, but anything 1392 bytes and above just gives me a connection that times out and retries a few times.
2018-03-27 18:14:34 botocore.vendored.requests.packages.urllib3.connectionpool INFO Starting new HTTPS connection (1): xxx.s3.amazonaws.com
2018-03-27 18:14:34 botocore.vendored.requests.packages.urllib3.connectionpool INFO Starting new HTTPS connection (1): xxx.s3.xxx.amazonaws.com
2018-03-27 18:15:14 botocore.vendored.requests.packages.urllib3.connectionpool INFO Resetting dropped connection: xxx.s3.xxx.amazonaws.com
I've tried this with every variant of uploading to S3 from Boto, including boto3.resource('s3').meta.client.upload_file, boto3.resource('s3').meta.client.upload_fileobj, and boto3.resource('s3').Bucket('xxx').put_object.
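For reference, the calls look roughly like this (bucket name, key, and local path are placeholders):

import boto3

s3 = boto3.resource('s3')

# Variant 1: upload a file from disk by path
s3.meta.client.upload_file('local_file.bin', 'xxx', 'some/key.bin')

# Variant 2: upload from an open file object
with open('local_file.bin', 'rb') as f:
    s3.meta.client.upload_fileobj(f, 'xxx', 'some/key.bin')

# Variant 3: put the bytes directly via the Bucket resource
with open('local_file.bin', 'rb') as f:
    s3.Bucket('xxx').put_object(Key='some/key.bin', Body=f.read())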
Any ideas what could be wrong here?
Related
We have an AWS production setup with pieces including EC2, S3, and CloudFront (among others). The website on EC2 generates XML feeds which include a number of images for each item (over 300k images in total). The XML feed is consumed by a third party, which processes the feed and downloads any new images. All image links point to CloudFront with an S3 bucket as its origin.
Reviewing the third party's logs, many of those images are downloaded successfully, but many still fail with 502 Bad Gateway responses. Looking at the CloudFront logs, all I'm seeing is OriginError with no indication of what's causing the error. Most discussions I've found about CloudFront 502 errors point to SSL issues and seem to involve people getting a 502 on every request. SSL isn't a factor here, and most requests process successfully, so this is an intermittent issue - and I haven't been able to reproduce it manually.
I suspect S3 rate limiting, but even with that many images I don't think the third party is grabbing them anywhere near fast enough to trigger it. I could be wrong. Either way, I can't figure out what's causing the issue - and therefore how to fix it - since I'm not getting a more specific error from S3/CloudFront. Below is one row from the CloudFront log, broken down field by field (field names per the standard CloudFront access-log format).
date: 2021-10-21 (from log file ABC.2021-10-21-21.ABC)
time: 21:09:47
x-edge-location: DFW53-C1
sc-bytes: 508
c-ip: ABCIP
cs-method: GET
cs(Host): ABC.cloudfront.net
cs-uri-stem: /ABC.jpg
sc-status: 502
cs(Referer): -
cs(User-Agent): ABCUA
cs-uri-query: -
cs(Cookie): -
x-edge-result-type: Error
x-edge-request-id: ABCID
x-host-header: ABC.cloudfront.net
cs-protocol: https
cs-bytes: 294
time-taken: 4.045
x-forwarded-for: -
ssl-protocol: TLSv1.2
ssl-cipher: ECDHE-RSA-AES128-GCM-SHA256
x-edge-response-result-type: Error
cs-protocol-version: HTTP/1.1
fle-status: -
fle-encrypted-fields: -
c-port: 11009
time-to-first-byte: 4.045
x-edge-detailed-result-type: OriginError
sc-content-type: application/json
sc-content-len: 36
sc-range-start: -
sc-range-end: -
I'd like to automate replicating a GitHub repo into an S3 bucket (for the sole reason that CloudFormation modules must reference templates in S3).
The quickstart below looked like it could do this, but it doesn't succeed for me, even though GitHub reports that the webhook push for my repository was delivered successfully.
https://aws-quickstart.github.io/quickstart-git2s3/
I configured the following parameters. I'm not sure what to put for the allowed IPs, so I tested with it fully open.
AllowedIps 0.0.0.0/0 -
ApiSecret **** -
CustomDomainName - -
ExcludeGit True -
OutputBucketName - -
QSS3BucketName aws-quickstart -
QSS3BucketRegion us-east-1 -
QSS3KeyPrefix quickstart-git2s3/ -
ScmHostnameOverride - -
SubnetIds subnet-124j124 -
VPCCidrRange 172.31.0.0/16 -
VPCId vpc-l1kj4lk2j1l2k4j
I also tried manually executing the CodeBuild project, but got this error:
COMMAND_EXECUTION_ERROR: Error while executing command: python3 - << "EOF" from boto3 import client import os s3 = client('s3') kms = client('kms') enckey = s3.get_object(Bucket=os.getenv('KeyBucket'), Key=os.getenv('KeyObject'))['Body'].read() privkey = kms.decrypt(CiphertextBlob=enckey)['Plaintext'] with open('enc_key.pem', 'w') as f: print(privkey.decode("utf-8"), file=f) EOF . Reason: exit status 1
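For readability, the inline script that step runs is essentially the following (the KeyBucket and KeyObject environment variable names come from the error above):

import os
from boto3 import client

s3 = client('s3')
kms = client('kms')

# Fetch the KMS-encrypted deploy key from the key bucket
enckey = s3.get_object(Bucket=os.getenv('KeyBucket'), Key=os.getenv('KeyObject'))['Body'].read()

# Decrypt it with KMS and write the plaintext PEM to disk
privkey = kms.decrypt(CiphertextBlob=enckey)['Plaintext']
with open('enc_key.pem', 'w') as f:
    print(privkey.decode('utf-8'), file=f)

The exit status 1 just means the script raised an exception somewhere in these calls.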
The GitHub webhook page reports this response:
Headers
Content-Length: 0
Content-Type: application/json
Date: Thu, 24 Jun 2021 21:33:47 GMT
Via: 1.1 9b097dfab92228268a37145aac5629c1.cloudfront.net (CloudFront)
X-Amz-Apigw-Id: 1l4kkn14l14n=
X-Amz-Cf-Id: 1l43k135ln13lj1n3l1kn414==
X-Amz-Cf-Pop: IAD89-C1
X-Amzn-Requestid: 32kjh235-d470-1l412-bafa-l144l1
X-Amzn-Trace-Id: Root=1-60d4fa3b-73d7403073276ca306853b49;Sampled=0
X-Cache: Miss from cloudfront
Body
{}
From the following link:
https://aws-quickstart.github.io/quickstart-git2s3/
you can see the following excerpt, which I've included below:
Allowed IP addresses (AllowedIps)
18.205.93.0/25,18.234.32.128/25,13.52.5.0/25
Comma-separated list of allowed IP CIDR blocks. The default addresses listed are BitBucket Cloud IP ranges.
As such, since you said you're using GitHub, I believe you should use this URL to determine the IP range:
https://api.github.com/meta
As that API responds with JSON, you should look for the hooks attribute, since I believe that's what describes the IP ranges GitHub uses for webhook deliveries.
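For example, something like this would pull the current webhook ranges as a comma-separated list you could paste into AllowedIps (a minimal sketch, assuming the requests library is available):

import requests

# GitHub publishes its IP ranges at this endpoint; the "hooks" key
# lists the CIDR blocks that webhook deliveries originate from.
meta = requests.get('https://api.github.com/meta').json()
print(','.join(meta['hooks']))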
Why don't you just copy or check out the file you want before you run your CloudFormation commands? There's no reason to get too fancy.
git checkout -- path/to/some/file
aws cloudformation ...
Otherwise, why not fork the repo and add your CloudFormation stuff to it, so it's all in one place? You could also delete everything you don't need and merge/pull changes in the future. That way your deploys will be reproducible, and you can easily roll back from one commit to another.
I'm downloading around 400 files asynchronously from my Amazon S3 bucket in my iOS app (Swift), but sometimes I get this error for several of these files. The maximum file size is around 4 MB, and the minimum is a few KB.
Error is Optional(Error Domain=NSURLErrorDomain Code=-1001 "The request timed out." UserInfo={NSUnderlyingError=0x600000451190 {Error Domain=kCFErrorDomainCFNetwork Code=-1001 "(null)" UserInfo={_kCFStreamErrorCodeKey=-2102, _kCFStreamErrorDomainKey=4}}, NSErrorFailingURLStringKey=https://s3.us-east-2.amazonaws.com/mybucket/folder/file.html, NSErrorFailingURLKey=https://s3.us-east-2.amazonaws.com/mybucket/folder/file.html, _kCFStreamErrorDomainKey=4, _kCFStreamErrorCodeKey=-2102, NSLocalizedDescription=The request timed out.})
How can I prevent it?
Try increasing the timeout:
let urlconfig = URLSessionConfiguration.default
urlconfig.timeoutIntervalForRequest = 300 // allow each request up to 300 seconds
let session = URLSession(configuration: urlconfig) // use this session for the S3 downloads
I noticed that uploading small files to an S3 bucket is very slow. A file of around 100 KB takes 200 ms to upload. Both the bucket and our app are in Oregon; the app is hosted on EC2.
I googled it and found some blogs; e.g. http://improve.dk/pushing-the-limits-of-amazon-s3-upload-performance/
It mentions that HTTP can be significantly faster than HTTPS.
We're using boto 2.45; I'm wondering whether boto uses HTTPS or HTTP by default? And is there any parameter to configure this behavior in boto?
Thanks in advance!
The boto3 client includes a use_ssl parameter:
use_ssl (boolean) -- Whether or not to use SSL. By default, SSL is used. Note that not all services support non-ssl connections.
Looks like it's time for you to move to boto3!
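For example (bucket and file names here are just placeholders):

import boto3

# Plain-HTTP S3 client; boto3 uses HTTPS unless use_ssl=False is passed
s3 = boto3.client('s3', use_ssl=False)
s3.upload_file('local_file.bin', 'my-bucket', 'remote_key.bin')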
I tried boto3, which has a nice use_ssl parameter in its client constructor. However, it turned out that boto3 is significantly slower than boto2 for us; there are actually already many posts online about this issue.
Finally, I found that boto2 also has a similar parameter, is_secure:
self.s3Conn = S3Connection(config.AWS_ACCESS_KEY_ID, config.AWS_SECRET_KEY, host=config.S3_ENDPOINT, is_secure=False)
Setting is_secure to False saves us about 20 ms. Not bad.
I tried googling and reading the documentation, but I cannot find the largest image size they support. I have a 15.7 GB image, and I cannot push it to Container Registry:
gcloud docker -- push eu.gcr.io/XXXXXX/YYYYYY:ZZZZZZ
The push refers to a repository [eu.gcr.io/XXXXXX/YYYYYY]
5efa92011d99: Retrying in 1 second
8bac40556b9d: Retrying in 8 seconds
e4990dfff478: Retrying in 14 seconds
9f8566ee5135: Retrying in 10 seconds
unknown: Bad Request.
Please contact us with the un-redacted image at gcr-contact@google.com
In general (this may not be the issue for you), the problem with large images is that the short-lived access tokens you receive via our normal token exchange expire before the upload finishes, which results in failed uploads. You're going to have to explore JSON key authentication in order to enable the very long sessions needed when uploading your images to GCS: https://cloud.google.com/container-registry/docs/advanced-authentication