I want to get the bucket policy for the various buckets. I tried the following code snippet (picked from the boto3 documentation):
import boto3

conn = boto3.resource('s3')
# BucketPolicy() only builds a lazy resource object; it does not fetch the policy
bucket_policy = conn.BucketPolicy('demo-bucket-py')
print(bucket_policy)
But here's the output I get:
s3.BucketPolicy(bucket_name='demo-bucket-py')
What should I fix here? Or is there another way to get the access policy for S3?
Try print(bucket_policy.policy). More information on that here.
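For reference, a minimal sketch of that resource-based approach, assuming default credentials are configured and the bucket actually has a policy attached:

import boto3

s3 = boto3.resource('s3')
# BucketPolicy() builds a lazy resource; reading .policy triggers the
# GetBucketPolicy call and returns the policy document as a JSON string
bucket_policy = s3.BucketPolicy('demo-bucket-py')
print(bucket_policy.policy)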
This worked for me:
import boto3
# Create an S3 client
s3 = boto3.client('s3')
# Call to S3 to retrieve the policy for the given bucket
result = s3.get_bucket_policy(Bucket='my-bucket')
print(result)
To perform this you need to configure or pass your keys explicitly, like this: s3 = boto3.client("s3", aws_access_key_id=access_key_id, aws_secret_access_key=secret_key). But a much better way is to run the aws configure command and enter your credentials there; see the setup docs: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html. Once set up, you won't need to enter your keys in your code again; boto3 or the AWS CLI will fetch them automatically behind the scenes.
You can even set up different profiles to work with different accounts, as in the sketch below.
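For example, a minimal sketch assuming a profile named my-profile was created with aws configure --profile my-profile, and a bucket named my-bucket exists:

import boto3

# Build a session from a named profile instead of hardcoding keys
session = boto3.session.Session(profile_name='my-profile')
s3 = session.client('s3')

# Retrieve and print the bucket policy using that profile's credentials
result = s3.get_bucket_policy(Bucket='my-bucket')
print(result['Policy'])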
Context: I have configured the right aws-access-key and aws-secret-key.
I can see the bucket's contents in the aws-console but not via aws-cli.
Here's my boto3 code:
import boto3

# Enter the name of your S3 bucket here
bucket_name = 'xxxx'
# Enter the name of the region where your S3 bucket is located
region_name = 'ap-southeast-1'
# Create an S3 client
s3 = boto3.client('s3', region_name=region_name)
# List all the objects in the bucket
objects = s3.list_objects(Bucket=bucket_name)
# Print the names of all the objects in the bucket
for obj in objects['Contents']:
    print(obj['Key'])
I have "s3:List*" under my AWS-policy. What am I missing?
I am trying to list all buckets using aws-cli; it works in the aws-console but not in the CLI. I have rechecked my aws-secret/access key and everything is right.
EDIT: aws-cli throws this error:
An error occurred (AccessDenied) when calling the ListBuckets operation: Access Denied
Usually this kind of error happens if you have multiple AWS profiles on your PC and you are using the wrong profile to make the call to AWS.
Make sure the default profile is the profile that has access to the AWS account.
You can also use aws s3 ls --profile PROFILE_NAME to get the list of buckets if you have multiple profiles
Run aws sts get-caller-identity to get the current caller identity
From ListBuckets - Amazon Simple Storage Service:
Returns a list of all buckets owned by the authenticated sender of the request. To use this operation, you must have the s3:ListAllMyBuckets permission.
The wording is a bit confusing, but:
ListBuckets returns a list of the names of S3 buckets in your AWS Account
ListObjects returns a list of objects in a particular S3 bucket
Your Python code is calling list_objects().
The AWS CLI error is saying ListBuckets operation: Access Denied, which suggests that you are trying to obtain a list of buckets, rather than a list of objects.
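To make the distinction concrete, here is a rough sketch of both calls in boto3; my-bucket is a placeholder name:

import boto3

s3 = boto3.client('s3')

# ListBuckets: needs s3:ListAllMyBuckets, returns the bucket names in the account
for bucket in s3.list_buckets()['Buckets']:
    print(bucket['Name'])

# ListObjectsV2: needs s3:ListBucket on that bucket, returns the object keys in it
for obj in s3.list_objects_v2(Bucket='my-bucket').get('Contents', []):
    print(obj['Key'])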
I am currently learning about boto3 and how it can interact with AWS using both the client and resource methods. I was made to understand that it doesn't matter which one I use and I can still get access, except in some cases where I need client features that are not available through the resource interface, in which case I would use the created resource variable, i.e. from
import boto3
s3_resource = boto3.resource('s3')
Hence if there is a need for me to access some client features, I would simply specify
s3_resource.meta.client
But the main issue here is: I tried creating clients/resources for EC2, S3, IAM, and Redshift, so I did this
import boto3
ec2 = boto3.resource('ec2',
                     region_name='us-west-2',
                     aws_access_key_id=KEY,
                     aws_secret_access_key=SECRET)
s3 = boto3.resource('s3',
                    region_name='us-west-2',
                    aws_access_key_id=KEY,
                    aws_secret_access_key=SECRET)
iam = boto3.client('iam',
                   region_name='us-west-2',
                   aws_access_key_id=KEY,
                   aws_secret_access_key=SECRET)
redshift = boto3.resource('redshift',
                          region_name='us-west-2',
                          aws_access_key_id=KEY,
                          aws_secret_access_key=SECRET)
But I get this error
UnknownServiceError: Unknown service: 'redshift'. Valid service names are: cloudformation, cloudwatch, dynamodb, ec2, glacier, iam, opsworks, s3, sns, sqs
During handling of the above exception, another exception occurred:
...
- s3
- sns
- sqs
Consider using a boto3.client('redshift') instead of a resource for 'redshift'
Why is that? I thought I could create these using the commands that I specified. Please help.
I suggest that you consult the Boto3 documentation for Amazon Redshift. It does, indeed, show that there is no resource method for Redshift (or Redshift Data API, or Redshift Serverless).
Also, I recommend against using aws_access_key_id and aws_secret_access_key in your code unless there is a specific need (such as extracting them from Environment Variables). It is better to use the AWS CLI aws configure command to store AWS credentials in a configuration file, which will be automatically accessed by AWS SDKs such as boto3.
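For instance, a minimal sketch of the client-based approach for Redshift, picking up credentials from the shared configuration files rather than from code (the region and the example call are illustrative):

import boto3

# Redshift has no resource interface, so use the low-level client;
# credentials are resolved from the aws configure files or environment
redshift = boto3.client('redshift', region_name='us-west-2')

# Example call: list existing clusters (an empty list is fine on a fresh account)
clusters = redshift.describe_clusters().get('Clusters', [])
print([c['ClusterIdentifier'] for c in clusters])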
How do I write a Lambda function in AWS (Python) to delete the contents of S3 buckets? Please share a template for this; I just want the code.
This will help you in setting up a Python-based Lambda, including the entry point for the handler:
https://stackify.com/aws-lambda-with-python-a-complete-getting-started-guide/
Once that is figured out, you need to create an s3 client using:
import boto3
client = boto3.client("s3")
Then you can follow the user guide to empty bucket:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/empty-bucket.html
Also note that the Lambda runs using an assumed role; please make sure the IAM role has the relevant permissions:
https://docs.aws.amazon.com/lambda/latest/dg/lambda-intro-execution-role.html
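Putting those pieces together, here is a rough sketch of such a handler; the bucket name is a placeholder and the execution role is assumed to allow s3:ListBucket and s3:DeleteObject on it:

import boto3

client = boto3.client("s3")

def lambda_handler(event, context):
    # Empty the bucket (this deletes the objects, not the bucket itself)
    bucket = "my-bucket"
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        objects = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if objects:
            # delete_objects accepts up to 1000 keys, which matches one listing page
            client.delete_objects(Bucket=bucket, Delete={"Objects": objects})
    return {"statusCode": 200, "body": "Bucket emptied"}

For versioned buckets you would also need to delete object versions and delete markers, which this sketch does not cover.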
I'm an AWS newbie trying to use Textract API, their OCR service.
As far as I understand, I need to upload files to an S3 bucket and then run Textract on them.
I got the bucket up and the file inside it:
I got the permissions:
But when I run my code, it fails.
import boto3
import trp
# Document
s3BucketName = "textract-console-us-east-1-057eddde-3f44-45c5-9208-fec27f9f6420"
documentName = "ok0001_prioridade01_x45f3.pdf"
# Amazon Textract client
textract = boto3.client('textract', region_name="us-east-1",
                        aws_access_key_id="xxxxxx",
                        aws_secret_access_key="xxxxxxxxx")

# Call Amazon Textract
response = textract.analyze_document(
    Document={
        'S3Object': {
            'Bucket': s3BucketName,
            'Name': documentName
        }
    },
    FeatureTypes=["TABLES"])
Here is the error I get:
botocore.errorfactory.InvalidS3ObjectException: An error occurred (InvalidS3ObjectException) when calling the AnalyzeDocument operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.
What am I missing? How could I solve that?
You are missing an S3 access policy; you should add the AmazonS3ReadOnlyAccess managed policy if you want a quick solution for your needs.
A good practice is to apply the least-privilege principle and only grant access when needed. So I'd advise you to create a specific policy that grants access only to your S3 bucket textract-console-us-east-1-057eddde-3f44-45c5-9208-fec27f9f6420, along the lines of the sketch below.
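As a rough sketch of what that scoped-down permission could look like, attached as an inline policy to the IAM user whose keys the code uses (the user name and policy name below are placeholders):

import json
import boto3

iam = boto3.client('iam')

bucket = 'textract-console-us-east-1-057eddde-3f44-45c5-9208-fec27f9f6420'
policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Action': ['s3:GetObject'],
        'Resource': f'arn:aws:s3:::{bucket}/*',
    }],
}

# Attach the policy inline to the user whose access keys the Textract code uses
iam.put_user_policy(
    UserName='textract-user',          # placeholder user name
    PolicyName='TextractS3ReadOnly',   # placeholder policy name
    PolicyDocument=json.dumps(policy),
)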
Amazon Textract currently supports PNG, JPEG, and PDF formats. Looks like you are using PDF.
Once you have a valid format, you can use the Python S3 API to read the object's data from S3 and pass the resulting byte array to the analyze_document method. To see a full example of how to use the AWS SDK for Python (Boto3) with Amazon Textract to detect text, form, and table elements in document images, see:
https://github.com/awsdocs/aws-doc-sdk-examples/blob/master/python/example_code/textract/textract_wrapper.py
Try following that code example to see if your issue is resolved.
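If you go the byte-array route instead of the S3Object reference, a minimal sketch (bucket, key, and region copied from the question; the document is assumed to be in a format and size that synchronous AnalyzeDocument accepts) could look like:

import boto3

s3 = boto3.client('s3', region_name='us-east-1')
textract = boto3.client('textract', region_name='us-east-1')

# Read the document bytes straight from S3 ...
obj = s3.get_object(
    Bucket='textract-console-us-east-1-057eddde-3f44-45c5-9208-fec27f9f6420',
    Key='ok0001_prioridade01_x45f3.pdf',
)
document_bytes = obj['Body'].read()

# ... and pass them to Textract directly instead of an S3Object reference
response = textract.analyze_document(
    Document={'Bytes': document_bytes},
    FeatureTypes=['TABLES'],
)
print(len(response['Blocks']), 'blocks detected')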
"Could you provide some clearance on the params to use"
I just ran the Java V2 example and it works perfectly. In this example, I am using a PNG file located in a specific Amazon S3 bucket.
Here are the parameters that you need:
Make sure when implementing this in Python, you set the same parameters.
I have a usecase to use AWS Lambda to copy files/objects from one S3 bucket to another. In this usecase the source S3 bucket is in a separate AWS account (say Account 1), where the provider has only given us an Access Key & Secret Access Key. Our Lambda runs in Account 2 and the destination bucket can be either in Account 2 or some other Account 3 altogether, which can be accessed using an IAM role. The setup is like this due to multiple partners sharing data files.
Usually I use the following boto3 call to copy the contents between two buckets when everything is in the same account, but I want to know how this can be modified for the new usecase:
copy_source_object = {'Bucket': source_bucket_name, 'Key': source_file_key}
s3_client.copy_object(CopySource=copy_source_object, Bucket=destination_bucket_name, Key=destination_file_key)
How can the above code be modified to fit my usecase of having an access-key-based connection to the source bucket and roles for the destination bucket (which can be a cross-account role as well)? Please let me know if any clarification is required.
There are multiple options here. The easiest is providing credentials to boto3 (docs). I would suggest retrieving the keys from the SSM Parameter Store or Secrets Manager so they're not stored hardcoded.
Edit: I realize the problem now; you can't use the same session for both buckets, which makes sense. The exact thing you want (i.e. using copy_object) is not possible. The trick is to use two separate sessions so you don't mix the credentials. You would need to get_object from the first account and put_object to the second account. You should be able to simply put the resp['Body'] from the get in the put request, but I haven't tested this.
import boto3

acc1_session = boto3.session.Session(
    aws_access_key_id=ACCESS_KEY_acc1,
    aws_secret_access_key=SECRET_KEY_acc1
)
acc2_session = boto3.session.Session(
    aws_access_key_id=ACCESS_KEY_acc2,
    aws_secret_access_key=SECRET_KEY_acc2
)

acc1_client = acc1_session.client('s3')
acc2_client = acc2_session.client('s3')

copy_source_object = {'Bucket': source_bucket_name, 'Key': source_file_key}

resp = acc1_client.get_object(Bucket=source_bucket_name, Key=source_file_key)
acc2_client.put_object(Bucket=destination_bucket_name, Key=destination_file_key, Body=resp['Body'])
Your situation appears to be:
Account-1:
Amazon S3 bucket containing files you wish to copy
You have an Access Key + Secret Key from Account-1 that can read these objects
Account-2:
AWS Lambda function that has an IAM Role that can write to a destination bucket
When using the CopyObject() command, the credentials used must have read permission on the source bucket and write permission on the destination bucket. There are normally two ways to do this:
Use credentials from Account-1 to 'push' the file to Account-2. This requires a Bucket Policy on the destination bucket that permits PutObject for the Account-1 credentials. Also, you should set ACL=bucket-owner-full-control to hand over control to Account-2. (This sounds similar to your situation.) OR
Use credentials from Account-2 to 'pull' the file from Account-1. This requires a Bucket Policy on the source bucket that permits GetObject for the Account-2 credentials.
If you can't ask for a change to the Bucket Policy on the source bucket that permits Account-2 to read the contents, then you'll need a Bucket Policy on the destination bucket that permits write access by the credentials from Account-1.
This is made more complex by the fact that you are potentially copying the object to a bucket in "some other account". There is no easy answer if you are starting to use 3 accounts in the process.
Bottom line: If possible, ask them for a change to the source bucket's Bucket Policy so that your Lambda function can read the files without having to change credentials. It can then copy objects to any bucket that the function's IAM Role can access.
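For what it's worth, a rough sketch of the 'push' approach described above, using the Account-1 keys for the copy (all names and keys below are placeholders):

import boto3

# Placeholders for the Account-1 keys and bucket/key names
ACCESS_KEY_ACC1 = 'AKIA...'
SECRET_KEY_ACC1 = '...'
source_bucket_name = 'partner-source-bucket'
source_file_key = 'incoming/data.csv'
destination_bucket_name = 'my-destination-bucket'
destination_file_key = 'incoming/data.csv'

# Client authenticated as Account-1, which can already read the source bucket
acc1_s3 = boto3.client(
    's3',
    aws_access_key_id=ACCESS_KEY_ACC1,
    aws_secret_access_key=SECRET_KEY_ACC1,
)

# Requires the destination bucket's policy to allow s3:PutObject and s3:PutObjectAcl
# for Account-1; the ACL hands control of the new object to the destination account
acc1_s3.copy_object(
    CopySource={'Bucket': source_bucket_name, 'Key': source_file_key},
    Bucket=destination_bucket_name,
    Key=destination_file_key,
    ACL='bucket-owner-full-control',
)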