Problem
I have multiple files in the same S3 bucket. When I try to load one file into Snowflake, I get an "access denied" error. When I try a different file (in the same bucket), I can load it into Snowflake successfully.
The highlighted file does not load into Snowflake; this is the error I get. A different file in the same bucket loads into Snowflake successfully.
Known difference: the file that does not work was generated by AWS. The file that can be loaded into Snowflake was also generated by AWS, but was saved to my local machine and then re-uploaded to the bucket. The only difference is that I brought it down to my local machine.
Question: Is there a known file permission on parquet files? Why does this behavior go away when I download the file and re-upload it to the same bucket? It cannot be an S3 bucket issue; it has to be some encoding on the parquet file.
You are making some bad assumptions here. Each S3 object can have separate ACL (permission) values. You need to check what the ACL settings are by drilling down to view the details of each of those objects in S3. My guess is AWS is writing the objects to S3 with a private ACL, and when you re-uploaded one of them to the bucket you saved it with a public ACL.
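If you want to compare the two objects programmatically rather than clicking through the console, a minimal sketch with boto3 might look like this (the bucket and key names are placeholders):

import boto3

s3 = boto3.client("s3")

# Sketch: print the ACL grants of the working and non-working objects side by side.
for key in ["exports/aws_generated.parquet", "exports/reuploaded.parquet"]:
    acl = s3.get_object_acl(Bucket="my-bucket", Key=key)
    print(key, "owner:", acl["Owner"].get("DisplayName", acl["Owner"]["ID"]))
    for grant in acl["Grants"]:
        grantee = (grant["Grantee"].get("DisplayName")
                   or grant["Grantee"].get("URI")
                   or grant["Grantee"].get("ID"))
        print("  ", grant["Permission"], "->", grantee)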
Turns out I needed to add KMS permissions to the user accessing the file.
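For anyone hitting the same thing: that resolution suggests the AWS-generated file was encrypted with SSE-KMS, so whichever user or role reads it also needs decrypt access on the KMS key. A hedged sketch of checking the key and granting that access with boto3 (bucket, key, user, and policy names are placeholders):

import json
import boto3

s3 = boto3.client("s3")

# Sketch: see whether the object is SSE-KMS encrypted and which key was used.
head = s3.head_object(Bucket="my-bucket", Key="exports/aws_generated.parquet")
print(head.get("ServerSideEncryption"), head.get("SSEKMSKeyId"))  # e.g. 'aws:kms', key ARN

# Sketch: grant the reading user decrypt access on that key (inline IAM policy).
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["kms:Decrypt", "kms:DescribeKey"],
        "Resource": head["SSEKMSKeyId"],  # the key ARN/ID reported above
    }],
}

iam = boto3.client("iam")
iam.put_user_policy(
    UserName="snowflake-stage-user",      # placeholder: the user accessing the file
    PolicyName="AllowDecryptExportKey",   # placeholder
    PolicyDocument=json.dumps(policy),
)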
Related
Just my thinking: some of us work on several files and frequently re-upload a file with the same name to Amazon S3. By default, the permissions will be reset (assuming I don't use Versioning).
I need to keep the same permissions for any uploaded file whose name matches a file that already exists in the Amazon S3 bucket.
I know it may not be a good idea, but technically, how can we achieve this?
Thanks
It is not possible to upload an object and request that the existing ACL settings be kept on the new object.
Instead, you should specify the ACL when the object is uploaded.
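For example, with boto3 you could pass the ACL explicitly on every upload so overwriting an object keeps the permissions you want. A sketch (bucket, key, and the canned ACL are placeholders):

import boto3

s3 = boto3.client("s3")

# Sketch: re-apply the desired ACL on every upload, since S3 will not carry
# the previous object's ACL over to the new object.
s3.upload_file(
    "report.csv",                 # local file
    "my-bucket",                  # bucket
    "reports/report.csv",         # key being overwritten
    ExtraArgs={"ACL": "bucket-owner-full-control"},  # the canned ACL you want applied
)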
Task: load 13 TB (hundreds of files) from an external S3 bucket into my company's S3 bucket.
Given: .pem and .ppk files, S3 hostname, username.
Done so far: able to view the files via FileZilla/WinSCP using the provided .pem/.ppk file, hostname, and username.
Requirements: determine the best way to copy these hundreds of .gz files from the external vendor's S3 bucket to my company's S3 bucket, preserving the same structure, and then load them into Snowflake from the internal S3 bucket. Options being considered: AWS Snowball, Python, volume --> S3, Python to load into Snowflake from S3.
I am unsure how to proceed. Any input?
If the vendor is willing to work with you and S3 is the final destination, I would ask if they could set up replication to your bucket.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication.html
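If they agree, the vendor would add a replication rule on their bucket pointing at yours. Roughly something like the following with boto3 (role ARN and bucket names are placeholders; versioning must be enabled on both buckets, and cross-account replication also needs a bucket policy on your side):

import boto3

s3 = boto3.client("s3")

# Sketch of the replication rule the vendor would add on their bucket.
s3.put_bucket_replication(
    Bucket="vendor-source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
        "Rules": [{
            "ID": "replicate-to-company",
            "Status": "Enabled",
            "Prefix": "",  # replicate everything, preserving the key structure
            "Destination": {"Bucket": "arn:aws:s3:::company-destination-bucket"},
        }],
    },
)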
I have uploaded a file to S3 and, while uploading it, I also specified some metadata (x-amz-meta-metdata). However, when I sync the S3 bucket using AWS DataSync, the metadata is missing (right-click the file --> Properties --> Details tab in the popup that opens).
Please let me know how I can achieve this, or whether it is possible to keep the metadata when saving an S3 file to the local machine.
Thanks
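For what it's worth, the x-amz-meta-* values are stored on the S3 object itself, not inside the file's contents, so a plain copy of the file on disk generally will not show them in its Properties dialog. You can read them back through the API before (or instead of) syncing; a small sketch (bucket and key are placeholders):

import boto3

s3 = boto3.client("s3")

# Sketch: read the user metadata stored on the S3 object itself.
head = s3.head_object(Bucket="my-bucket", Key="docs/report.pdf")
print(head["Metadata"])  # the x-amz-meta-* headers, e.g. {'metdata': '...'}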
My application runs on the client PC. It produces log files including error reports and user actions.
To collect and analyze the log files, I want to upload them to Amazon S3 from the client PC.
But is this safe? My app has no authentication, so users can upload an unlimited number of files. I am concerned that a malicious user could upload a fake error report or a huge file, and I'd like the S3 bucket not to exceed the free quota. Is there any best practice for this task?
Just make sure that the files you are uploading to Amazon S3 are kept private and the Amazon S3 bucket is kept private. These are the default settings and are enforced by Amazon S3 Block Public Access unless somebody has specifically changed the settings.
With this configuration, the files are only accessible to people with AWS credentials that have been granted permission to access the S3 bucket.
In addition to John's answer, you can use AWS KMS (https://aws.amazon.com/kms/?nc1=h_ls) to encrypt your data at rest.
With regard to file size, I would say you should limit the size of the uploaded file in your application.
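Putting those two suggestions together, the upload path in the app could refuse oversized files and request SSE-KMS encryption explicitly. A sketch (the bucket name, KMS key alias, and the 5 MB limit are placeholders):

import os
import boto3

MAX_LOG_BYTES = 5 * 1024 * 1024  # refuse anything larger than 5 MB (placeholder limit)

def upload_log(path: str) -> None:
    # Sketch: cap the size client-side, then upload privately with SSE-KMS.
    if os.path.getsize(path) > MAX_LOG_BYTES:
        raise ValueError(f"{path} exceeds the {MAX_LOG_BYTES} byte limit")

    s3 = boto3.client("s3")
    s3.upload_file(
        path,
        "app-client-logs",                       # private bucket (placeholder)
        f"logs/{os.path.basename(path)}",
        ExtraArgs={
            "ServerSideEncryption": "aws:kms",
            "SSEKMSKeyId": "alias/app-log-key",  # placeholder key alias
        },
    )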
Object and file names in my S3 bucket changed from the names I selected to those displayed in the screenshot below. And now when I update a file, it uploads successfully but nothing changes: the modification date is not updated, and the changes in the code are not visible on the web page. Can someone please help me find out what happened to this bucket and how I can fix it?
The files you are showing are created by Amazon S3 bucket logging, which creates log files of access requests to Amazon S3.
Logging is activated within the Properties panel of your bucket, where you can nominate a target bucket and prefix for the logs.
So, your files are not being renamed. Rather, they are additional log files that are generated by Amazon S3.
If they are in the same location as your files, things will get confusing! Your files are still in there, but probably later in the naming scheme.
I would recommend:
Go into the bucket's properties
If you do not need the logs, then disable bucket logging
If you wish to keep the logs, configure them to write to a different bucket, or to the same bucket but with a prefix (directory), as in the sketch below
Delete or move the existing log files so that you will be left with just your non-log files
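A sketch of both options with boto3 (the bucket names and prefix are placeholders):

import boto3

s3 = boto3.client("s3")

# Option A: keep server access logging, but write the logs under a dedicated prefix
# (or point TargetBucket at a separate logs-only bucket).
s3.put_bucket_logging(
    Bucket="my-website-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-website-bucket",  # or a separate logging bucket
            "TargetPrefix": "logs/",
        }
    },
)

# Option B: turn logging off by sending an empty logging status.
s3.put_bucket_logging(Bucket="my-website-bucket", BucketLoggingStatus={})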