AWS Service Quota: How to get service quota for Amazon S3 using boto3 - amazon-web-services

I get the error "An error occurred (NoSuchResourceException) when calling the GetServiceQuota operation:" when running the following boto3 Python code to get the value of the "Buckets" quota:
import boto3
client_quota = boto3.client('service-quotas')
resp_s3 = client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
In the above code, QuotaCode "L-89BABEE8" is for "Buckets". I presumed the ServiceCode for Amazon S3 would be "s3", so I used that, but I guess it is wrong and is what throws the error. I looked for documentation on the ServiceCode for S3 but could not find any. I also tried "S3" (uppercase 'S') and "Amazon S3", but those didn't work either.
What I tried?
client_quota = boto3.client('service-quotas')
resp_s3 = client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
What I expected?
Output for S3 in the same format as the EC2 output returned by resp_ec2 = client_quota.get_service_quota(ServiceCode='ec2', QuotaCode='L-6DA43717')

I just played around with this and I'm seeing the same thing you are, empty responses from any service quota list or get command for service s3. However s3 is definitely the correct service code, because you see that come back from the service quota list_services() call. Then I saw there are also list and get commands for AWS default service quotas, and when I tried that it came back with data. I'm not entirely sure, but based on the docs I think any quota that can't be adjusted, and possibly any quota your account hasn't requested an adjustment for, will probably come back with an empty response from get_service_quota() and you'll need to run get_aws_default_service_quota() instead.
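The list_services() check mentioned above can be sketched roughly as follows. This is only a sketch: find_service_codes is a made-up helper name, and it assumes the client is a boto3 'service-quotas' client whose ListServices paginator returns pages with a "Services" list of ServiceCode/ServiceName entries.

```python
def find_service_codes(client, name_fragment):
    """Return (ServiceCode, ServiceName) pairs whose name contains name_fragment.

    `client` is assumed to be boto3.client('service-quotas').
    """
    needle = name_fragment.lower()
    matches = []
    # list_services is paginated, so walk every page
    for page in client.get_paginator("list_services").paginate():
        for svc in page["Services"]:
            if needle in svc["ServiceName"].lower():
                matches.append((svc["ServiceCode"], svc["ServiceName"]))
    return matches
```

Calling find_service_codes(boto3.client('service-quotas'), 'S3') should show which ServiceCode the API itself reports for S3.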
So I believe what you need to do is probably run this first:
client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
And if that throws an exception, then run the following:
client_quota.get_aws_default_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
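Put together, the two calls above could be wrapped like this. It is a minimal sketch, assuming the failure surfaces as NoSuchResourceException (as in the question) and that both responses carry the number under resp["Quota"]["Value"]; get_quota_value is a hypothetical helper name.

```python
def get_quota_value(client, service_code, quota_code):
    """Return the applied quota value, falling back to the AWS default value.

    `client` is assumed to be boto3.client('service-quotas').
    """
    try:
        resp = client.get_service_quota(
            ServiceCode=service_code, QuotaCode=quota_code
        )
    except client.exceptions.NoSuchResourceException:
        # No account-specific value was found, so ask for the AWS default
        resp = client.get_aws_default_service_quota(
            ServiceCode=service_code, QuotaCode=quota_code
        )
    return resp["Quota"]["Value"]
```

For the question's case that would be get_quota_value(client_quota, 's3', 'L-89BABEE8').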

Related

Elasticsearch 6.3 (AWS) snapshot restore progress ERROR: "/_recovery is not allowed"

I take manual snapshots of an Elasticsearch index
These are stored in a snapshot repo on S3
I have created a new ES cluster, also version 6.3
I have connected the new cluster to the S3 snapshot repo via python script method mentioned in this blog post: https://medium.com/docsapp-product-and-technology/aws-elasticsearch-manual-snapshot-and-restore-on-aws-s3-7e9783cdaecb
I have confirmed that the new cluster has access to the snapshot repo via the GET /_snapshot/manual-snapshot-repo/_all?pretty command
I have initiated a snapshot restore to this new cluster via:
POST /_snapshot/manual-snapshot-repo/snapshot_name/_restore
{
"indices": "reports",
"ignore_unavailable": false,
"include_global_state": false
}
It is clear that this operation has at least partially succeeded, as the cluster status has gone from "green" to "yellow" and a GET request to /_cluster/health yields information that suggests actions are occurring on an otherwise empty cluster... not to mention storage is starting to be utilized (when viewing cluster health on AWS).
I would very much like to monitor the progress of the restore operation.
Elasticsearch docs suggest to use the Recovery API. Docs Link: https://www.elastic.co/guide/en/elasticsearch/reference/6.3/indices-recovery.html
It is clear from the docs that GET /_recovery?human or GET /my_index/_recovery?human should yield restore progress.
However, I encounter the following error:
"Message": "Your request: '/_recovery' is not allowed."
I get the same message when attempting the GET command in the following ways:
Via Kibana dev tools
Via chrome address bar (It's just a GET operation after all)
Via Advanced REST Client (a Chrome app)
I have not been able to locate any other mention of this particular error message.
How can I utilize the GET /_recovery?human command on my ElasticSearch 6.3 clusters?
Thank you!
Amazon's managed Elasticsearch service does not expose every Elasticsearch endpoint.
For version 6.3 you can check this link for the available endpoints; _recovery is not on the list, which is why you get that message.
Without the _recovery endpoint you will need to rely on _cluster/health.
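Relying on _cluster/health could look something like the sketch below. It assumes the domain accepts plain unsigned GET requests (for example through an IP-based access policy); with IAM-based access the request would need to be signed instead. fetch_health, summarize and wait_for_green are made-up helper names.

```python
import json
import time
from urllib.request import urlopen


def fetch_health(base_url):
    """GET <base_url>/_cluster/health and decode the JSON body."""
    with urlopen(base_url.rstrip("/") + "/_cluster/health") as resp:
        return json.load(resp)


def summarize(health):
    """Pick the fields that change while shards are being restored."""
    return {
        "status": health["status"],
        "initializing": health["initializing_shards"],
        "unassigned": health["unassigned_shards"],
        "active_percent": health["active_shards_percent_as_number"],
    }


def wait_for_green(fetch, interval=30, max_polls=120):
    """Poll until the cluster reports green; return the last summary seen."""
    summary = None
    for _ in range(max_polls):
        summary = summarize(fetch())
        print(summary)
        if summary["status"] == "green":
            break
        time.sleep(interval)
    return summary
```

Usage would be along the lines of wait_for_green(lambda: fetch_health("https://your-domain-endpoint")), where the URL is a placeholder. Note that a cluster whose replicas cannot be assigned stays yellow, so the initializing and unassigned shard counts are the more useful numbers to watch in that case.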

AWS EMR - Hive creating new table in S3 results in AmazonS3Exception: Bad Request

I have a Hive script I'm running in EMR that is creating a partitioned Parquet table in S3 from a ~40GB gzipped CSV file also stored in S3.
The script runs fine for about 4 hours but reaches a point (pretty sure when it is just about done creating the Parquet table) where it errors out. The logs show that the error is:
HiveException: Hive Runtime Error while processing row
caused by:
AmazonS3Exception: Bad Request
There really isn't any more useful information in the logs that I can see. It is reading the CSV file fine from S3 and it creates a couple metadata files in S3 fine as well, so I've confirmed the instance has read/write permissions to the Bucket.
I really can't think of anything else that could be going on, and I wish there was more info in the logs about which request to S3 Hive is making that comes back as "Bad Request". Anyone have any ideas?
BadRequest is a fairly meaningless response from AWS which it sends if there is any reason why it doesn't like the caller. Nobody really knows what's happening.
The troubleshooting docs for the ASF S3A connector list some causes, but they aren't complete, and they are based on guesswork from what made the message go away.
If you have the request ID which failed, you can submit a support request for amazon to see what they saw on their side.
If it makes you feel any better, I'm seeing it when I try to list exactly one directory in an object store, and I'm co-author of the s3a connector. Like I said "guesswork". Once you find out, add a comment here or, if it's not in the troubleshooting doc, submit a patch to hadoop on the topic.

Apache Zeppelin with Athena handling session token using jdbc Interpreter

I am trying to connect Athena with Apache Zeppelin. I need to handle the secret_key, Access_key, and Session_token. I am finding it hard to establish my connection with the Zeppelin JDBC interpreter.
I am following the steps as mentioned in this block,
If anyone can help me establish the connection using the AWS session token approach, that would be helpful.
Thank You
The main docs for this are here:
https://docs.aws.amazon.com/athena/latest/ug/connect-with-jdbc.html
I found there are 2 driver versions, 1.1.0 and 1.0.1. I could only get Zeppelin working with 1.1.0, and the links on that page don't go to that file; the only way to get it was with the aws s3 cp command
e.g.
aws s3 cp s3://athena-downloads/drivers/AthenaJDBC41-1.1.0.jar .
although I've given feedback on that page so it should be fixed soon.
Regarding the parameters, you enter the Access_Key in default.user and the Secret_key in default.password. The default.driver should be com.amazonaws.athena.jdbc.AthenaDriver
The default.s3_staging_dir is actually the bucket where CSV results are written, so it needs to match your Athena settings.
There is no mention of where you might put a session token; however, you could always try putting it on the JDBC connection string (which goes in the default.url parameter value)
e.g.
jdbc:awsathena://athena.{REGION}.amazonaws.com:443?SessionToken=blahblahsomethingrealsessiontokengoeshere
but of course, replace {REGION} with the actual aws region and use your real session token.
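If you do try the connection-string route, keep in mind that session tokens usually contain characters such as +, / and =, which need percent-encoding inside a URL query string. A small sketch of building the default.url value follows; whether the driver actually honors a SessionToken parameter is untested here, and build_athena_url is a made-up helper name.

```python
from urllib.parse import quote


def build_athena_url(region, session_token):
    """Build the JDBC URL from the answer, percent-encoding the token.

    Assumption: the driver reads a SessionToken query parameter, which
    this thread does not confirm.
    """
    # safe="" so that "/" is encoded as well
    token = quote(session_token, safe="")
    return f"jdbc:awsathena://athena.{region}.amazonaws.com:443?SessionToken={token}"
```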

AWS | Boto3 | RDS |function DownloadDBLogFilePortion |cannot download a log file because it contains binary data |

When I try to download all log files from an RDS instance, in some cases I get this error in my Python output:
An error occurred (InvalidParameterValue) when calling the DownloadDBLogFilePortion operation: This file contains binary data and should be downloaded instead of viewed.
I handle pagination and throttling correctly (using the Marker parameter and the sleep function).
This is my call:
log_page = request_paginated(rds, DBInstanceIdentifier=id_rds, LogFileName=log, NumberOfLines=1000)
where rds is the boto3 RDS client.
And this is the definition of my function:
def request_paginated(rds, **kwargs):
    return rds.download_db_log_file_portion(**kwargs)
Like I said, most of the time this function works, but sometimes it returns:
"An error occurred (InvalidParameterValue) when calling the DownloadDBLogFilePortion operation: This file contains binary data and should be downloaded instead of viewed"
Can you help me please? :)
UPDATE: the problem is a known issue with downloading log files that contain non-printable characters. I will try the solution proposed by AWS support as soon as possible.
LATEST UPDATE: This is an extract of my discussion with aws support team:
There is a known issue with non binary characters when using the boto based AWS cli, however this issue is not present when using the older Java based cli.
There is currently no way to fix the issue that you are experiencing while using the boto based AWS cli, the workaround is to make the API call from the Java based cli
The AWS team are aware of this issue and are working on a way to resolve this, however they do not have an ETA for when this will be released.
So the solution is: use the Java API
Giuseppe
http://docs.aws.amazon.com/AmazonRDS/latest/APIReference/CommonErrors.html
InvalidParameterValue : An invalid or out-of-range value was supplied
for the input parameter.
InvalidParameterValue in boto means the data you passed does not comply with the API's requirements. It is probably an invalid name: possibly something is wrong with your id_rds variable, or with your LogFileName, etc. The arguments must meet the function's requirements.
response = client.download_db_log_file_portion(
    DBInstanceIdentifier='string',
    LogFileName='string',
    Marker='string',
    NumberOfLines=123
)
(UPDATE)
For example, LogFileName must be the exact name of a file that exists inside the RDS instance.
Please make sure the log file EXISTS inside the instance. Use this AWS CLI command for a quick check:
aws rds describe-db-log-files --db-instance-identifier <my-rds-name>
Do check Marker (string) and NumberOfLines (integer) as well, for a mismatched type or an out-of-range value. Since they are not required, skip them at first, then test with them later.
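The same existence check, plus a Marker-based download loop, could be sketched in boto3 as follows. This is an illustration under assumptions, not a fix for the binary-data error itself: log_file_exists and download_log are made-up helper names, rds is assumed to be boto3.client('rds'), and throttling/retry handling is left out.

```python
def log_file_exists(rds, instance_id, log_name):
    """Return True if log_name is listed for the instance."""
    resp = rds.describe_db_log_files(
        DBInstanceIdentifier=instance_id,
        FilenameContains=log_name,  # narrows the listing to matching names
    )
    return any(f["LogFileName"] == log_name for f in resp["DescribeDBLogFiles"])


def download_log(rds, instance_id, log_name, lines_per_call=1000):
    """Fetch a whole log file by following Marker until no data is pending."""
    chunks = []
    marker = "0"  # "0" asks for lines from the beginning of the file
    while True:
        page = rds.download_db_log_file_portion(
            DBInstanceIdentifier=instance_id,
            LogFileName=log_name,
            Marker=marker,
            NumberOfLines=lines_per_call,
        )
        chunks.append(page.get("LogFileData") or "")
        if not page.get("AdditionalDataPending"):
            break
        marker = page["Marker"]  # continue where this portion ended
    return "".join(chunks)
```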

InvalidSignatureException when using boto3 for dynamoDB on aws

I'm facing some sort of credentials issue when trying to connect to my DynamoDB on AWS. Locally it all works fine and I can connect using the env variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION and then
dynamoConnection = boto3.resource('dynamodb', endpoint_url='http://localhost:8000')
When changing to live creds in the env variables and setting the endpoint_url to the dynamoDB on aws this fails with:
"botocore.exceptions.ClientError: An error occurred (InvalidSignatureException) when calling the Query operation: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details."
The creds are valid, as they are used in a different app which talks to the same DynamoDB. I've also tried passing them directly to the method rather than through env variables, but the error persisted. Furthermore, to avoid any issues with trailing spaces, I've even put the credentials directly in the code. I'm using Python v3.4.4.
Is there maybe a header that should also be set that I'm not aware of? Any hints would be appreciated.
EDIT
I've now also created new credentials (to make sure they contain only alphanumeric characters), but still no dice.
You shouldn't use the endpoint_url when you are connecting to the real DynamoDB service. That's really only for connecting to local services or non-standard endpoints. Instead, just specify the region you want:
dynamoConnection = boto3.resource('dynamodb', region_name='us-west-2')
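One way to keep both setups in a single code path is to pass endpoint_url only when a local endpoint is explicitly configured. The sketch below assumes a made-up environment variable, DYNAMODB_ENDPOINT_URL, that is set only for local development, and a fallback region of us-west-2 taken from the example above; dynamodb_kwargs and connect are hypothetical helper names.

```python
import os


def dynamodb_kwargs(env=os.environ):
    """Return keyword arguments for boto3.resource('dynamodb', ...).

    DYNAMODB_ENDPOINT_URL is a hypothetical variable for local use only.
    """
    kwargs = {"region_name": env.get("AWS_DEFAULT_REGION", "us-west-2")}
    endpoint = env.get("DYNAMODB_ENDPOINT_URL")
    if endpoint:
        # Local DynamoDB or another non-standard endpoint
        kwargs["endpoint_url"] = endpoint
    return kwargs


def connect(env=os.environ):
    """Create the DynamoDB resource; boto3 is imported only when needed."""
    import boto3
    return boto3.resource("dynamodb", **dynamodb_kwargs(env))
```

With DYNAMODB_ENDPOINT_URL unset, connect() targets the real service in the configured region; with it set to http://localhost:8000, it targets the local instance.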
This error can be a sign that your system time is off. Check your:
1. Time zone
2. Time settings.
If the time is set automatically and is still wrong, you should fix your time settings.
"sudo hwclock --hctosys" (which sets the system clock from the hardware clock) should do the trick.
Just wanted to point out that accessing DynamoDB from a C# environment (using AWS.NET SDK) I ran into this error and the way I solved it was to create a new pair of AWS access/secret keys.
Worked immediately after I changed those keys in the code.