I am planning to develop a web application which can perform some basic text edit functions (like insert and delete) on S3 files. Could anyone show me a path forward? I am currently learning Lambda, and have followed the tutorial here: http://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html
I can now create a Lambda function which can modify files on S3 and invoke it via the AWS CLI. What else do I need to know and do to create this web application? Thank you very much.
You would need to look at AWS API Gateway. This can be the front end to your web application.
Also note that S3 is an object storage service, not block storage: objects cannot be modified in place. If your edits are frequent it may not suit your use case, because every time you want to edit the text you will have to download the entire file, modify it, and upload it back again (see the sketch below the consistency note). Also be mindful of S3's eventual consistency:
Amazon S3 Data Consistency Model
Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in all regions with one caveat. The caveat is that if you make a HEAD or GET request to the key name (to find if the object exists) before creating the object, Amazon S3 provides eventual consistency for read-after-write.
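As an illustration of that download, modify, upload cycle, here is a minimal sketch of a Lambda handler sitting behind API Gateway. The request fields, bucket, and key are hypothetical, and this assumes the Python runtime with boto3; it is a sketch, not a production implementation.

```python
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # API Gateway (proxy integration) delivers the request body as a string.
    body = json.loads(event.get("body") or "{}")
    bucket = body["bucket"]              # hypothetical request fields
    key = body["key"]
    insert_text = body.get("insert", "")

    # S3 has no partial-update API: download the whole object...
    obj = s3.get_object(Bucket=bucket, Key=key)
    text = obj["Body"].read().decode("utf-8")

    # ...apply the edit in memory...
    text += insert_text

    # ...and upload the whole object back.
    s3.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))

    return {"statusCode": 200, "body": json.dumps({"length": len(text)})}
```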
Related
What AWS service is appropriate for storing a single key-value pair of data that is updated daily? The stored data will be retrieved by several other services throughout the day (~100 times total per day).
My current solution is to create and upload a JSON to an S3 bucket. All other services download the JSON and get the data. When it's time to update the data, I create a new JSON and upload it to replace the previously uploaded JSON. This works pretty well but I'm wondering if there is a more appropriate way.
There are many:
AWS Systems Manager Parameter Store
AWS Secrets Manager
DynamoDB
S3
^ those are some of the most common. Without knowing more I'd suggest you consider DynamoDB or Parameter Store. Both are simple and inexpensive--although S3 is fine, too.
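For illustration, a rough sketch of the Parameter Store route; the parameter name and JSON value are hypothetical, and boto3 is assumed.

```python
import boto3

ssm = boto3.client("ssm")

# Daily update (overwrites the previous value).
ssm.put_parameter(
    Name="/myapp/daily-value",    # hypothetical parameter name
    Value='{"price": 42.0}',
    Type="String",
    Overwrite=True,
)

# Read path used by the other services (~100 times/day).
value = ssm.get_parameter(Name="/myapp/daily-value")["Parameter"]["Value"]
print(value)
```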
The only reason not to use S3 would be if you want governance of the value handled automatically on the AWS side, such as expiry or rotation, which is what Secrets Manager gives you; that setup also makes handing the value to third parties much harder.
Your solution seems very good, especially since S3 IS an object store and a JSON document is just an object.
The system you described has such low usage that you shouldn't spend time wondering whether there is a better way :)
Just make sure you are aware that Amazon S3 provides read-after-write consistency for PUTs of new objects in your S3 bucket in all regions, with one caveat: if you make a HEAD or GET request to the key name (to find out whether the object exists) before creating the object, Amazon S3 provides eventual consistency for read-after-write.
and to refer to your comment:
The S3 way seemed a little hacky, so I am trying to see if there is a better approach
The S3 way is not hacky at all - storing objects under keys is exactly what S3, as a key-value object store, is intended for :)
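As an illustration of the pattern you already have, here is a minimal sketch; the bucket and key names are hypothetical and boto3 is assumed.

```python
import json
import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "my-config-bucket", "daily-value.json"   # hypothetical names

# Writer: replace the object once a day.
s3.put_object(
    Bucket=BUCKET,
    Key=KEY,
    Body=json.dumps({"price": 42.0}).encode("utf-8"),
    ContentType="application/json",
)

# Readers: fetch and parse.
data = json.loads(s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read())
print(data["price"])
```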
As I'm new on AWS and a little confused by all the similar services, I would like some leads and to know whether I am heading in the right direction.
I have tar.gz archives stored in AWS S3 Glacier Deep Archive. I would like that, when a restore is requested, the archive is automatically extracted and the folders and files it contains are put in S3 (with an expiration date).
These archives are too big to be extracted via Lambda (300 GB or more).
My idea would be to trigger a Lambda function when the restore is complete and use that Lambda function to start another AWS service that does the extraction. I was thinking either AWS Batch or Fargate. Which service do you think is the most suitable? For this kind of simple task, is it preferable to use an ARM architecture?
If someone has already done this before and has code to share, I'm interested (if not, I'll try to put my final solution here for others).
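A hedged sketch of the trigger side of that idea: a Lambda subscribed to s3:ObjectRestore:Completed events that hands the archive off to AWS Batch. The job queue and job definition names are hypothetical, and the Batch job itself (on Fargate or EC2) would do the actual tar extraction.

```python
import re
import urllib.parse
import boto3

batch = boto3.client("batch")

def handler(event, context):
    # S3 event notification for the completed restore.
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

    # Batch job names only allow letters, digits, hyphens, and underscores.
    safe_name = re.sub(r"[^A-Za-z0-9_-]", "-", key)[:100]

    batch.submit_job(
        jobName="extract-" + safe_name,
        jobQueue="archive-extract-queue",       # hypothetical queue
        jobDefinition="archive-extract-job",    # hypothetical job definition
        containerOverrides={
            "environment": [
                {"name": "SRC_BUCKET", "value": bucket},
                {"name": "SRC_KEY", "value": key},
            ]
        },
    )
```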
I use the Illumina BaseSpace service to do high-throughput sequencing secondary analyses. This service uses AWS servers, and therefore all files are stored on S3.
I would like to transfer the files (results of the analyses) from BaseSpace to my own AWS S3 account. I would like to know the best strategy to make things go quickly, knowing that in the end it boils down to copying files from an S3 bucket belonging to Illumina to an S3 bucket belonging to me.
The solutions I'm thinking of:
use the BaseSpace CLI tool to copy the files to our on-premise servers, then transfer them back up to AWS
use the same tool from an EC2 instance.
use the Illumina API to get a pre-signed download URL (but then how can I use this URL to download the file directly into my S3 bucket? see the sketch below).
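For the pre-signed URL option, a minimal sketch run from an EC2 instance (or anywhere with AWS credentials), streaming the download straight into your bucket without touching the local disk. The URL, bucket, and key are placeholders; this assumes boto3 and the requests library.

```python
import boto3
import requests

s3 = boto3.client("s3")

def copy_presigned_url_to_s3(presigned_url, dest_bucket, dest_key):
    # stream=True keeps the body as a file-like object instead of loading
    # the whole file into memory.
    with requests.get(presigned_url, stream=True) as resp:
        resp.raise_for_status()
        resp.raw.decode_content = True
        # upload_fileobj performs a managed multipart upload from the stream.
        s3.upload_fileobj(resp.raw, dest_bucket, dest_key)

# Placeholder values for illustration only.
copy_presigned_url_to_s3(
    "https://example.com/presigned-url",
    "my-results-bucket",
    "results/run1.tar.gz",
)
```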
If I use an EC2 instance, what kind of instance do you recommend to have enough resources without having too many (and therefore spending money for nothing)?
Thanks in advance,
Quentin
I'm new to AWS and have a feasibility question for a file management system I'm trying to build. I would like to set up a system where people use the Amazon S3 browser and drop either a CSV or Excel file into their specific bucket. Then I would like to automate the process of taking that CSV/Excel file and inserting it into a table within RDS. This assumes that the table has already been built and that the Excel/CSV files will always be formatted the same and will be in the same exact place every single time. Is it possible to automate this process, or at least get it to a point where very minimal human interference is needed? I'm new to AWS, so I'm not exactly sure of the limits of going from S3 to RDS. Thank you in advance.
It's definitely possible. AWS supports notifications from S3 to SNS, which can be forwarded automatically to SQS: http://aws.amazon.com/blogs/aws/s3-event-notification/
S3 can also send notifications to AWS Lambda to run your own code directly.
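For the Lambda route, a rough sketch of a handler fired by the S3 upload event that loads the CSV into RDS. This assumes a MySQL-compatible RDS instance, the pymysql library packaged with the function, and a pre-existing table; the table, column, and environment variable names are hypothetical.

```python
import csv
import io
import os

import boto3
import pymysql

s3 = boto3.client("s3")

def handler(event, context):
    # S3 event notification for the newly uploaded CSV.
    record = event["Records"][0]["s3"]
    bucket, key = record["bucket"]["name"], record["object"]["key"]

    # Read the CSV straight from S3 and skip the header row.
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    rows = list(csv.reader(io.StringIO(body)))[1:]

    conn = pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
    )
    try:
        with conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO uploads (col_a, col_b, col_c) VALUES (%s, %s, %s)",
                rows,
            )
        conn.commit()
    finally:
        conn.close()
```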
I've gone through the Amazon SDK/documentation and there isn't a lot around programmatically querying/searching for documents in an S3 bucket.
Sure, I can get a document by ID/name, but I want the ability to search by other metadata tags such as author.
I would appreciate some guidance and a specific example of a query being executed, not a local iteration once all documents or items have been pulled down.
[…] there isn't a lot around programmatically querying/searching for documents in an S3 bucket.
Right. S3 is flat object storage and doesn't provide a query interface.
[…] I want the ability to search by other metadata tags such as author.
This will need to be solved by your application logic; it is not built into S3.
For example, you can store the metadata about an S3 document/file in DynamoDB. You query DynamoDB for the metadata, which includes a pointer to the file in S3.
Unfortunately, if you already have a bunch of files in S3, you'll need to find a way to build that initial index of your data.
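A minimal sketch of that pattern; the table name, attribute names, and the "author-index" global secondary index are all hypothetical and would need to be created separately.

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("document-metadata")  # hypothetical table

# When a document is stored in S3, write an index entry pointing at it.
table.put_item(Item={
    "doc_id": "report-2024-01",
    "author": "jane.doe",
    "s3_bucket": "my-doc-bucket",
    "s3_key": "reports/report-2024-01.pdf",
})

# Search by author by querying a global secondary index keyed on that
# attribute, then follow the pointers back to the S3 objects.
results = table.query(
    IndexName="author-index",
    KeyConditionExpression=Key("author").eq("jane.doe"),
)
for item in results["Items"]:
    print(item["s3_bucket"], item["s3_key"])
```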
Amazon just released new features for CloudSearch:
http://aws.amazon.com/about-aws/whats-new/2014/03/24/amazon-cloudsearch-introduces-powerful-new-search-and-admin-features/.