How do I write lambda function in AWS(python) to delete the contents of S3 buckets. please share the template on this regard I just want the codes.
This will help you in setting up a Python based lambda - including entry point for handler:
https://stackify.com/aws-lambda-with-python-a-complete-getting-started-guide/
Once that is figured out, you need to create an s3 client using:
import boto3
client = boto3.client("s3")
Then you can follow the user guide to empty bucket:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/empty-bucket.html
Also note that the lambda is run using an assumed role, please make sure the IAM role has relevant permissions:
https://docs.aws.amazon.com/lambda/latest/dg/lambda-intro-execution-role.html
Related
I am trying to create a simple infrastructure using terraform.Terraform should create the lambda and s3 bucket, lambda is triggered using API gateway which is again created terraform.
I have created a role and assigned that to lambda so that lambda can put objects in s3 bucket.
My lambda is written in java, since I am assigning role to lambda to access S3, how do I use that role in my code?
I came across another article which suggested accessing S3 using the below code. I assumed the token generation would be taken care of this.
var s3Client = AmazonS3ClientBuilder.standard()
.withCredentials(InstanceProfileCredentialsProvider(false))
.withRegion("ap-southeast-2")
.build()
I am confused as to how to access s3, do I need to use the role created by terraform in my code or is there a different way to access S3 from java code?
You don't need to assume the role inside the Lambda function. Instead, simply configure the Lambda function to assume the IAM role. Or add the relevant S3 policy to the Lambda's existing IAM role.
You don't typically have to supply credentials or region explicitly in this case. Simply use:
AmazonS3 s3Client = new AmazonS3Client();
See the Terraform basic example of creating both an IAM role and a Lambda function, and configuring the Lambda function to assume the configured role.
Jarmods answer is correct that you can configure the role of the Lambda directly - but there are particular use cases where you may need to be first in one account, than the other. If you need to assume a role in the middle of your code, then use the STS functionality of your SDK. STS is the library in the aws sdk that controls assuming a role's credentials through code.
I'm an AWS newbie trying to use Textract API, their OCR service.
As far as I understood I need to upload files to a S3 bucket and then run textract on it.
I got the bucket on and the file inside it:
I got the permissions:
But when I run my code it bugs.
import boto3
import trp
# Document
s3BucketName = "textract-console-us-east-1-057eddde-3f44-45c5-9208-fec27f9f6420"
documentName = "ok0001_prioridade01_x45f3.pdf"
]\[\[""
# Amazon Textract client
textract = boto3.client('textract',region_name="us-east-1",aws_access_key_id="xxxxxx",
aws_secret_access_key="xxxxxxxxx")
# Call Amazon Textract
response = textract.analyze_document(
Document={
'S3Object': {
'Bucket': s3BucketName,
'Name': documentName
}
},
FeatureTypes=["TABLES"])
Here is the error I get:
botocore.errorfactory.InvalidS3ObjectException: An error occurred (InvalidS3ObjectException) when calling the AnalyzeDocument operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.
What am I missing? How could I solve that?
You are missing S3 access policy, you should add AmazonS3ReadOnlyAccess policy if you want a quick solution according to your needs.
A good practice is to apply the least privilege access principle and keep granting access when needed. So I'd advice you to create a specific policy to access your S3 bucket textract-console-us-east-1-057eddde-3f44-45c5-9208-fec27f9f6420 only and only in us-east-1 region.
Amazon Textract currently supports PNG, JPEG, and PDF formats. Looks like you are using PDF.
Once you have a valid format, you can use the Python S3 API to read the data of the object in the S3 object. Once you read the object, you can pass the byte array to the analyze_document method. TO see a full example of how to use the AWS SDK for Python (Boto3) with Amazon Textract to
detect text, form, and table elements in document images.
https://github.com/awsdocs/aws-doc-sdk-examples/blob/master/python/example_code/textract/textract_wrapper.py
Try following that code example to see if your issue is resolved.
"Could you provide some clearance on the params to use"
I just ran the Java V2 example and it works perfecly. In this example, i am using a PNG file located in a specific Amazon S3 bucket.
Here are the parameters that you need:
Make sure when implementing this in Python, you set the same parameters.
I want to get the bucket policy for the various buckets. I tried the following code snippet(picked from the boto3 documentation):
conn = boto3.resource('s3')
bucket_policy=conn.BucketPolicy('demo-bucket-py')
print(bucket_policy)
But here's the output I get :
s3.BucketPolicy(bucket_name='demo-bucket-py')
What shall I rectify here ? Or is there some another way to get the access policy for s3 ?
Try print(bucket_policy.policy). More information on that here.
this worked for me
import boto3
# Create an S3 client
s3 = boto3.client('s3')
# Call to S3 to retrieve the policy for the given bucket
result = s3.get_bucket_policy(Bucket='my-bucket')
print(result)
to perform this you need to configure or mention your keys like this s3=boto3.client("s3",aws_access_key_id=access_key_id,aws_secret_access_key=secret_key). BUT there is much better way to do this is by using aws configure command and enter your credentials. for setting up docs. Once you set up you wont need to enter your keys again in your code, boto3 or aws cli will automatically fetch it behind the scenes .https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html.
you can even set different profiles to work with different accounts
I have a usecase to use AWS Lambda to copy files/objects from one S3 bucket to another. In this usecase Source S3 bucket is in a separate AWS account(say Account 1) where the provider has only given us AccessKey & SecretAccess Key. Our Lambda runs in Account 2 and the destination bucket can be either in Account 2 or some other account 3 altogether which can be accessed using IAM role. The setup is like this due to multiple partner sharing data files
Usually, I used to use the following boto3 command to copy the contents between two buckets when everything is in the same account but want to know how this can be modified for the new usecase
copy_source_object = {'Bucket': source_bucket_name, 'Key': source_file_key}
s3_client.copy_object(CopySource=copy_source_object, Bucket=destination_bucket_name, Key=destination_file_key)
How can the above code be modified to fit my usecase of having accesskey based connection to source bucket and roles for destination bucket(which can be cross-account role as well)? Please let me know if any clarification is required
There's multiple options here. Easiest is by providing credentials to boto3 (docs). I would suggest retrieving the keys from the SSM parameter store or secrets manager so they're not stored hardcoded.
Edit: I realize the problem now, you can't use the same session for both buckets, makes sense. The exact thing you want is not possible (ie. use copy_object). The trick is to use 2 separate session so you don't mix the credentials. You would need to get_object from the first account and put_object to the second objects. You should be able to simply put the resp['Body'] from the get in the put request but I haven't tested this.
import boto3
acc1_session = boto3.session.Session(
aws_access_key_id=ACCESS_KEY_acc1,
aws_secret_access_key=SECRET_KEY_acc1
)
acc2_session = boto3.session.Session(
aws_access_key_id=ACCESS_KEY_acc2,
aws_secret_access_key=SECRET_KEY_acc2
)
acc1_client = acc1_session.client('s3')
acc2_client = acc2_session.client('s3')
copy_source_object = {'Bucket': source_bucket_name, 'Key': source_file_key}
resp = acc1_client.get_object(Bucket=source_bucket_name, Key=source_file_key)
acc2_client.put_object(Bucket=destination_bucket_name, Key=destination_file_key, Body=resp['Body'])
Your situation appears to be:
Account-1:
Amazon S3 bucket containing files you wish to copy
You have an Access Key + Secret Key from Account-1 that can read these objects
Account-2:
AWS Lambda function that has an IAM Role that can write to a destination bucket
When using the CopyObject() command, the credentials used must have read permission on the source bucket and write permission on the destination bucket. There are normally two ways to do this:
Use credentials from Account-1 to 'push' the file to Account-2. This requires a Bucket Policy on the destination bucket that permits PutObject for the Account-1 credentials. Also, you should set ACL= bucket-owner-full-control to handover control to Account-2. (This sounds similar to your situation.) OR
Use credentials from Account-2 to 'pull' the file from Account-1. This requires a Bucket Policy on the source bucket that permits GetObject for the Account-2 credentials.
If you can't ask for a change to the Bucket Policy on the source bucket that permits Account-2 to read the contents, then **you'll need a Bucket Policy on the Destination bucket that permits write access by the credentials from Account-1`.
This is made more complex by the fact that you are potentially copying the object to a bucket in "some other account". There is no easy answer if you are starting to use 3 accounts in the process.
Bottom line: If possible, ask them for a change to the source bucket's Bucket Policy so that your Lambda function can read the files without having to change credentials. It can then copy objects to any bucket that the function's IAM Role can access.
Is it possible to configure S3 bucket to run a Lambda function created in a different account? Basically what I'm trying to accomplish is that when new items are added to S3 bucket I want to run a lambda function in another account
You can do this by providing the full Lambda Function ARN to your S3 bucket. For example inside your bucket settings in the AWS Console:
This article will help you configure the correct IAM for cross account invocation. Also take a look at the AWS Lambda Permissions Model. Note that as far as I know the bucket and the Lambda function have to be in the same region!