I am trying to pass boto3 a list of bucket names and have it first enable versioning on each bucket, then enable a lifecycle policy on each.
I have run aws configure and have two profiles, both current, active user profiles with all the necessary permissions. The one I want to use is named "default".
import boto3

# Create session
s3 = boto3.resource('s3')

# Bucket list
buckets = ['BUCKET-NAME']

# iterate through list of buckets
for bucket in buckets:
    # Enable Versioning
    bucketVersioning = s3.BucketVersioning('bucket')
    bucketVersioning.enable()

    # Current lifecycle configuration
    lifecycleConfig = s3.BucketLifecycle(bucket)
    lifecycleConfig.add_rule={
        'Rules': [
            {
                'Status': 'Enabled',
                'NoncurrentVersionTransition': {
                    'NoncurrentDays': 7,
                    'StorageClass': 'GLACIER'
                },
                'NoncurrentVersionExpiration': {
                    'NoncurrentDays': 30
                }
            }
        ]
    }

    # Configure Lifecycle
    bucket.configure_lifecycle(lifecycleConfig)

print "Versioning and lifecycle have been enabled for buckets."
When I run this I get the following error:
Traceback (most recent call last):
File "putVersioning.py", line 27, in <module>
bucketVersioning.enable()
File "/usr/local/lib/python2.7/dist-packages/boto3/resources/factory.py", line 520, in do_action
response = action(self, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/boto3/resources/action.py", line 83, in __call__
response = getattr(parent.meta.client, operation_name)(**params)
File "/home/user/.local/lib/python2.7/site-packages/botocore/client.py", line 253, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/user/.local/lib/python2.7/site-packages/botocore/client.py", line 557, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the PutBucketVersioning operation: Access Denied
My profiles have full privileges, so that shouldn't be a problem. Is there something else I need to do to pass credentials? Thanks everyone!
To set the versioning state, you must be the bucket owner.
The above statement means: to use the PutBucketVersioning operation to enable versioning, you must be the owner of the bucket.
Use the command below to check the owner of the bucket. If you are the owner, you should be able to set the versioning state to ENABLED / SUSPENDED.
aws s3api get-bucket-acl --bucket yourBucketName
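The same check can also be done from boto3; a minimal sketch (get_bucket_acl returns the bucket's Owner, and the bucket name here is just a placeholder):
import boto3

s3Client = boto3.client('s3')
# Print the owner of the bucket to compare against your own canonical user
acl = s3Client.get_bucket_acl(Bucket='yourBucketName')
print(acl['Owner'])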
Ok, notionquest is correct; however, it appears I also goofed up in my code by quoting a variable:
bucketVersioning = s3.BucketVersioning('bucket')
should be
bucketVersioning = s3.BucketVersioning(bucket)
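For completeness, the versioning part of the loop then becomes (a minimal sketch; the lifecycle part is covered by the fuller answer further down):
for bucket in buckets:
    # Pass the loop variable, not the string literal 'bucket'
    bucketVersioning = s3.BucketVersioning(bucket)
    bucketVersioning.enable()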
Related
I have a specific use case where I want to upload an object to S3 at a specific prefix. A file already exists at that prefix, and I want to replace it with this new one. I am using boto3 to do this. Bucket versioning is turned off, so I expect the file to be overwritten. However, I get the following error.
{
"errorMessage": "An error occurred (InvalidRequest) when calling the CopyObject operation: This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes.",
"errorType": "ClientError",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 25, in lambda_handler\n s3.Object(bucket,product_key).copy_from(CopySource=bucket + '/' + product_key)\n",
" File \"/var/runtime/boto3/resources/factory.py\", line 520, in do_action\n response = action(self, *args, **kwargs)\n",
" File \"/var/runtime/boto3/resources/action.py\", line 83, in __call__\n response = getattr(parent.meta.client, operation_name)(*args, **params)\n",
" File \"/var/runtime/botocore/client.py\", line 386, in _api_call\n return self._make_api_call(operation_name, kwargs)\n",
" File \"/var/runtime/botocore/client.py\", line 705, in _make_api_call\n raise error_class(parsed_response, operation_name)\n"
]
}
This is what I have tried so far.
import boto3
import tempfile
import os
import tempfile

print('Loading function')

s3 = boto3.resource('s3')
glue = boto3.client('glue')
bucket = 'my-bucket'
bucket_prefix = 'my-prefix'

def lambda_handler(_event, _context):
    my_bucket = s3.Bucket(bucket)

    # Code to find the object name. There is going to be only one file.
    for object_summary in my_bucket.objects.filter(Prefix=bucket_prefix):
        product_key = object_summary.key
    print(product_key)

    # Using product_key variable I am trying to copy the same file name to the same location, which is when I get an error.
    s3.Object(bucket, product_key).copy_from(CopySource=bucket + '/' + product_key)

    # Maybe the following line is not required
    s3.Object(bucket, bucket_prefix).delete()
I have a very specific reason to copy the same file to the same location. AWS Glue doesn't pick up the same file once it has bookmarked it. By copying the file again, I am hoping the Glue bookmark will be dropped and the Glue job will consider this a new file.
I am not too tied to the name. If you can help me modify the above code to generate a new file at the same prefix level, that would work as well. There always has to be exactly one file here, though. Consider this file a static list of products that has been brought over from a relational DB into S3.
Thanks
From Tracking Processed Data Using Job Bookmarks - AWS Glue:
For Amazon S3 input sources, AWS Glue job bookmarks check the last modified time of the objects to verify which objects need to be reprocessed. If your input source data has been modified since your last job run, the files are reprocessed when you run the job again.
So, it seems your theory could work!
However, as the error message states, it is not permitted to copy an S3 object to itself "without changing the object's metadata, storage class, website redirect location or encryption attributes".
Therefore, you can add some metadata as part of the copy process and it will succeed. For example:
s3.Object(bucket,product_key).copy_from(CopySource=bucket + '/' + product_key, Metadata={'foo': 'bar'})
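A slightly fuller sketch of the same call, passing CopySource as a dict; depending on how S3 evaluates the request, you may also need MetadataDirective='REPLACE' so that S3 treats the metadata as actually changed (the metadata key and value below are arbitrary):
s3.Object(bucket, product_key).copy_from(
    CopySource={'Bucket': bucket, 'Key': product_key},
    Metadata={'touched': 'true'},     # arbitrary key/value, just to change something
    MetadataDirective='REPLACE'       # use the new metadata instead of copying the old
)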
I have an audio file in S3.
I don't know the language of the audio file. So I need to use IdentifyLanguage for start_transcription_job().
LanguageCode will be blank since I don't know the language of the audio file.
Environment
Using:
Python 3.8 runtime,
boto3 version 1.16.5,
botocore version 1.19.5,
no Lambda Layer.
Here is my code for the Transcribe job:
mediaFileUri = 's3://' + bucket_name + '/' + prefixKey

transcribe_client = boto3.client('transcribe')

response = transcribe_client.start_transcription_job(
    TranscriptionJobName="abc",
    IdentifyLanguage=True,
    Media={
        'MediaFileUri': mediaFileUri
    },
)
Then I get this error:
{
"errorMessage": "Parameter validation failed:\nMissing required parameter in input: \"LanguageCode\"\nUnknown parameter in input: \"IdentifyLanguage\", must be one of: TranscriptionJobName, LanguageCode, MediaSampleRateHertz, MediaFormat, Media, OutputBucketName, OutputEncryptionKMSKeyId, Settings, ModelSettings, JobExecutionSettings, ContentRedaction",
"errorType": "ParamValidationError",
"stackTrace": [
" File \"/var/task/app.py\", line 27, in TranscribeSoundToWordHandler\n response = response = transcribe_client.start_transcription_job(\n",
" File \"/var/runtime/botocore/client.py\", line 316, in _api_call\n return self._make_api_call(operation_name, kwargs)\n",
" File \"/var/runtime/botocore/client.py\", line 607, in _make_api_call\n request_dict = self._convert_to_request_dict(\n",
" File \"/var/runtime/botocore/client.py\", line 655, in _convert_to_request_dict\n request_dict = self._serializer.serialize_to_request(\n",
" File \"/var/runtime/botocore/validate.py\", line 297, in serialize_to_request\n raise ParamValidationError(report=report.generate_report())\n"
]
}
This error means that I must specify LanguageCode and that IdentifyLanguage is an unknown parameter.
I am 100% sure the audio file exists in S3. But without LanguageCode it doesn't work, and the IdentifyLanguage parameter is reported as unknown.
I am using a SAM application to test locally with this command:
sam local invoke MyHandler -e lambda\TheDirectory\event.json
I also ran cdk deploy and tested it in the AWS Lambda console with the same event.json, but I still get the same error.
I think this is related to the Lambda execution environment; I am not using any Lambda Layer.
I looked at the AWS Transcribe docs:
https://docs.aws.amazon.com/transcribe/latest/dg/API_StartTranscriptionJob.html
and the boto3 docs:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html#TranscribeService.Client.start_transcription_job
Both clearly state that LanguageCode is not required and that IdentifyLanguage is a valid parameter.
So what am I missing? Any ideas? What should I do?
Update:
I kept searching and asked a couple of people online; I think I should build the function container first so that SAM packages boto3 into the container.
So what I do is cdk synth a template file:
cdk synth --no-staging > template.yaml
Then:
sam build --use-container
sam local invoke MyHandler78A95900 -e lambda\TheDirectory\event.json
But I still get the same error; here is the full stack trace as well:
[ERROR] ParamValidationError: Parameter validation failed:
Missing required parameter in input: "LanguageCode"
Unknown parameter in input: "IdentifyLanguage", must be one of: TranscriptionJobName, LanguageCode, MediaSampleRateHertz, MediaFormat, Media, OutputBucketName, OutputEncryptionKMSKeyId, Settings, JobExecutionSettings, ContentRedaction
Traceback (most recent call last):
File "/var/task/app.py", line 27, in TranscribeSoundToWordHandler
response = response = transcribe_client.start_transcription_job(
File "/var/runtime/botocore/client.py", line 316, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/var/runtime/botocore/client.py", line 607, in _make_api_call
request_dict = self._convert_to_request_dict(
File "/var/runtime/botocore/client.py", line 655, in _convert_to_request_dict
request_dict = self._serializer.serialize_to_request(
File "/var/runtime/botocore/validate.py", line 297, in serialize_to_request
raise ParamValidationError(report=report.generate_report())
I really have no clue what I am doing wrong here. I also reported a GitHub issue, but it seems they can't reproduce it.
Main Question/Problem:
Unable to start_transcription_job
without LanguageCode
with IdentifyLanguage=True
What could cause this, and how can I solve it? (I don't know the language of the audio file; I want to identify the language of the audio file without providing a LanguageCode.)
Check whether you are using the latest boto3 version.
boto3.__version__
'1.16.5'
I tried it and it works.
import boto3
transcribe = boto3.client('transcribe')
response = transcribe.start_transcription_job(
    TranscriptionJobName='Test-20201-27',
    IdentifyLanguage=True,
    Media={'MediaFileUri': 's3://BucketName/DemoData/Object.mp4'}
)
print(response)
{
"TranscriptionJob": {
"TranscriptionJobName": "Test-20201-27",
"TranscriptionJobStatus": "IN_PROGRESS",
"Media": {
"MediaFileUri": "s3://BucketName/DemoData/Object.mp4"
},
"StartTime": "datetime.datetime(2020, 10, 27, 15, 41, 2, 599000, tzinfo=tzlocal())",
"CreationTime": "datetime.datetime(2020, 10, 27, 15, 41, 2, 565000, tzinfo=tzlocal())",
"IdentifyLanguage": "True"
},
"ResponseMetadata": {
"RequestId": "9e4f94a4-20e4-4ca0-9c6e-e21a8934084b",
"HTTPStatusCode": 200,
"HTTPHeaders": {
"content-type": "application/x-amz-json-1.1",
"date": "Tue, 27 Oct 2020 14:41:02 GMT",
"x-amzn-requestid": "9e4f94a4-20e4-4ca0-9c6e-e21a8934084b",
"content-length": "268",
"connection": "keep-alive"
},
"RetryAttempts": 0
}
}
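If the version printed locally differs from what the Lambda runtime actually loads, the runtime's bundled (older) boto3/botocore wins unless you package your own. A quick hedged check you can drop into the handler:
import boto3
import botocore

# Log the versions the Lambda execution environment actually imports;
# an older botocore will not recognize the IdentifyLanguage parameter.
print('boto3:', boto3.__version__, 'botocore:', botocore.__version__)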
In the end I noticed this was because my packaged Lambda function wasn't being uploaded for some reason. Here is how I solved it after getting help from a couple of people.
First modify CDK stack which define my lambda function like this:
from aws_cdk import (
aws_lambda as lambda_,
core
)
from aws_cdk.aws_lambda_python import PythonFunction
class MyCdkStack(core.Stack):
def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
super().__init__(scope, id, **kwargs)
# define lambda
my_lambda = PythonFunction(
self, 'MyHandler',
entry='lambda/MyHandler',
index='app.py',
runtime=lambda_.Runtime.PYTHON_3_8,
handler='MyHandler',
timeout=core.Duration.seconds(10)
)
This uses the aws-lambda-python module, which handles installing all required modules into the Docker image.
Next, cdk synth a template file
cdk synth --no-staging > template.yaml
At this point, it will bundle everything inside the entry path defined in PythonFunction and install all the dependencies listed in requirements.txt inside that entry path.
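For reference, a minimal requirements.txt inside that entry path could look like this (the version pins are only examples matching the versions mentioned above):
boto3>=1.16.5
botocore>=1.19.5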
Next, build the docker container
$ sam build --use-container
Make sure the template.yaml file is in the root directory. This will build a Docker container, and the artifact will be built inside the .aws-sam/build directory in the root directory.
Last step, invoke the function using sam:
sam local invoke MyHandler78A95900 -e path\to\event.json
Now the call to start_transcription_job described in my question above finally succeeds without any error.
In Conclusion:
At the very beginning I only ran pip install boto3, which only installs boto3 on my local system.
Then, I ran sam local invoke without first building the container with sam build --use-container.
Lastly, when I did run sam build, it didn't bundle what is defined in requirements.txt into .aws-sam/build, which is why I needed the aws-lambda-python module as described above.
Hi, I am trying to retrieve a secret named test from AWS Secrets Manager in the specified region using the Python SDK, and I am getting the error below while running a simple Python script:
#!/usr/bin/env python
import boto3
import base64
from botocore.exceptions import ClientError

def get_secret():
    secret_name = "test"
    region_name = "tesdas"
    print("inside rds secrete")

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )

    # In this sample we only handle the specific exceptions for the 'GetSecretValue' API.
    # See https://docs.aws.amazon.com/secretsmanager/latest/apireference/API_GetSecretValue.html
    # We rethrow the exception by default.
    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
        # Decrypts secret using the associated KMS CMK.
        # Depending on whether the secret is a string or binary, one of these fields will be populated.
        if 'SecretString' in get_secret_value_response:
            secret = get_secret_value_response['SecretString']
            print("RDS Secret")
            print(secret)
        else:
            decoded_binary_secret = base64.b64decode(get_secret_value_response['SecretBinary'])
    except ClientError as e:
        # Rethrow by default (exception handling elided in the original post)
        raise e

get_secret()
Below is the error I am getting while executing the script:
Traceback (most recent call last):
File "test.py", line 60, in <module>
get_secret()
File "test.py", line 18, in get_secret
region_name=region_name
File "/usr/lib/python2.7/site-packages/boto3/session.py", line 263, in client
aws_session_token=aws_session_token, config=config)
File "/usr/lib/python2.7/site-packages/botocore/session.py", line 836, in create_client
client_config=config, api_version=api_version)
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 64, in create_client
service_model = self._load_service_model(service_name, api_version)
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 97, in _load_service_model
api_version=api_version)
File "/usr/lib/python2.7/site-packages/botocore/loaders.py", line 132, in _wrapper
data = func(self, *args, **kwargs)
File "/usr/lib/python2.7/site-packages/botocore/loaders.py", line 378, in load_service_model
known_service_names=', '.join(sorted(known_services)))
botocore.exceptions.UnknownServiceError: Unknown service: 'secretsmanager'. Valid service names are: acm, apigateway, application-autoscaling, appstream, athena, autoscaling, batch, budgets, clouddirectory, cloudformation, cloudfront, cloudhsm, cloudsearch, cloudsearchdomain, cloudtrail, cloudwatch, codebuild, codecommit, codedeploy, codepipeline, codestar, cognito-identity, cognito-idp, cognito-sync, config, cur, datapipeline, dax, devicefarm, directconnect, discovery, dms, ds, dynamodb, dynamodbstreams, ec2, ecr, ecs, efs, elasticache, elasticbeanstalk, elastictranscoder, elb, elbv2, emr, es, events, firehose, gamelift, glacier, greengrass, health, iam, importexport, inspector, iot, iot-data, kinesis, kinesisanalytics, kms, lambda, lex-models, lex-runtime, lightsail, logs, machinelearning, marketplace-entitlement, marketplacecommerceanalytics, meteringmarketplace, mturk, opsworks, opsworkscm, organizations, pinpoint, polly, rds, redshift, rekognition, resourcegroupstaggingapi, route53, route53domains, s3, sdb, servicecatalog, ses, shield, sms, snowball, sns, sqs, ssm, stepfunctions, storagegateway, sts, support, swf, waf, waf-regional, workdocs, workspaces, xray
It looks like @Narsireddy is correct. The line region_name = "tesdas" does not specify a valid region. The region should look something like "us-east-1" or "us-west-2".
The following code should enable versioning on a bucket/list of buckets, and then set the lifecycle configuration.
import boto3

# Create session
s3 = boto3.resource('s3')
s3Client = boto3.client('s3')

# Bucket list
buckets = ['BUCKETNAMEHERE']

# iterate through list of buckets
for bucket in buckets:
    # Enable Versioning
    bucketVersioning = s3.BucketVersioning(bucket)
    bucketVersioning.enable()

    # Configure Lifecycle
    s3Client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={
            'Rules': [
                {
                    'Status': 'Enabled',
                    'NoncurrentVersionTransitions': [
                        {
                            'NoncurrentDays': 7,
                            'StorageClass': 'GLACIER'
                        },
                    ],
                    'NoncurrentVersionExpiration': {
                        'NoncurrentDays': 30
                    }
                },
            ]
        }
    )

print "Versioning and lifecycle have been enabled for buckets."
However, whenever I run this I get the following error:
File "putVersioning.py", line 42, in <module>
'NoncurrentDays': 30
File "/home/user/.local/lib/python2.7/site-packages/botocore/client.py", line 253, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/user/.local/lib/python2.7/site-packages/botocore/client.py", line 557, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (MalformedXML) when calling the PutBucketLifecycleConfiguration operation: The XML you provided was not well-formed or did not validate against our published schema
As far as I can tell, everything looks correct?
According to the docs here, you need to add a Filter element, which is required by the Amazon API but, confusingly enough, not marked as required by boto. I added the deprecated Prefix argument instead of Filter and it seems to work too.
This one works for me:
client.put_bucket_lifecycle_configuration(
    Bucket=s3_bucket,
    LifecycleConfiguration={
        'Rules': [
            {
                'Expiration': {'Days': 5},
                'Filter': {'Prefix': 'folder1/'},
                'ID': 'id',
                'Status': 'Enabled'
            }
        ]
    })
To see an actual schema, create a new rule in the S3 console and then call client.get_bucket_lifecycle_configuration(Bucket=s3_bucket).
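Applied to the rule from the question above, a hedged sketch using the same s3Client and bucket variables would be (an empty Prefix applies the rule to the whole bucket; the rule ID is arbitrary):
s3Client.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'noncurrent-to-glacier',   # arbitrary rule name
                'Filter': {'Prefix': ''},        # empty prefix = whole bucket
                'Status': 'Enabled',
                'NoncurrentVersionTransitions': [
                    {'NoncurrentDays': 7, 'StorageClass': 'GLACIER'}
                ],
                'NoncurrentVersionExpiration': {'NoncurrentDays': 30}
            }
        ]
    }
)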
I want to delete multiple GCS keys with boto. Its documentation suggests there is a multi-object delete method (delete_keys); however, I cannot get it to work.
According to this article it is possible for Amazon S3:
s3 = boto.connect_s3()
bucket = s3.get_bucket("basementcoders.logging")
result = bucket.delete_keys([key.name for key in bucket if key.name[-1] == '6'])
result.deleted
However, when I try the same thing for Google Cloud Storage it doesn't work:
bucket = BotoConnection().get_bucket(bucketName)
keys = [key for key in bucket]
print len(keys)
result = bucket.delete_keys(keys)
print result.deleted
print result.errors
Traceback (most recent call last):
File "gcsClient.py", line 166, in <module>
GcsClient.deleteMultipleObjects('debug_bucket')
File "gcsClient.py", line 155, in deleteMultipleObjects
result = bucket.delete_keys(keys)
File "/usr/local/lib/python2.7/dist-packages/boto/s3/bucket.py", line 583, in delete_keys
while delete_keys2(headers):
File "/usr/local/lib/python2.7/dist-packages/boto/s3/bucket.py", line 582, in delete_keys2
body)
boto.exception.GSResponseError: GSResponseError: 400 Bad Request
delete_keys uses S3's multi-object delete API, which Google Cloud Storage does not support. Thus, it is not possible to do it this way for Google Cloud Storage; you will need to call delete_key() once per key.
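A minimal sketch of the per-key approach, reusing the BotoConnection and bucketName from the question:
bucket = BotoConnection().get_bucket(bucketName)
# Delete each object individually, since GCS has no bulk-delete endpoint in boto
for key in bucket:
    bucket.delete_key(key.name)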